EGF RECEPTOR AGONISTS AND ANTAGONISTS

Field of the Invention 

This invention relates to the field of epidermal growth factor (EGF)
receptor structure and EGF receptor/ligand interactions. In particular, it
relates to the field of using the EGF receptor structure to select and screen
for agonists and antagonists of the polypeptide ligands.

Background of the Invention

Epidermal growth factor is a small polypeptide cytokine that
stimulates marked proliferation of epithelial tissues and is a member of a
larger family of structurally related cytokines such as transforming growth
factor α (TGFα), amphiregulin, betacellulin, heparin-binding EGF and some
viral gene products. Abnormal EGF family signalling is a characteristic of
certain cancers (Soler, C. & Carpenter, G., 1994, In Nicola, N. (ed) "Guidebook
to Cytokines and Their receptors", Oxford Univ. Press, Oxford, pp 194-197;
Walker, F. & Burgess, A. W., 1994, In Nicola, N. (ed) "Guidebook to Cytokines
and Their receptors", Oxford Univ. Press, Oxford, pp 198-201).

The epidermal growth factor receptor (EGFR) is the cell membrane
receptor for EGF (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203-212).
The EGFR also binds other ligands that contain amino acid sequences
classified as the EGF-like motif. Among these ligands, the three-dimensional
structures of EGF and TGFα have been determined by NMR (Montelione,
G.T., Wuthrich, K., Nice, E.C., Burgess, A.W. and Scheraga, H.A. (1986)
PNAS 83(22): 8594-8; Campbell, I.D., Cooke, R.M., Baron, M., Harvey, T.S.,
and Tappin, M.J. (1989) Prog. Growth Factor Res. 1, 13-22). Upon binding of
the ligand to the extracellular domain, the EGFR undergoes dimerization,
which eventually leads to the activation of its cytoplasmic protein tyrosine
kinase (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203-212). The EGFR is
also known as the ErbB-1 receptor and belongs to the type I family of receptor
tyrosine kinases (Ullrich, A., and Schlessinger, J. (1990) Cell 61, 203-212).
This group also includes the ErbB-2, ErbB-3 and ErbB-4 receptors. The ligand
of ErbB-2 is still unknown but it is clear that heregulin binds to ErbB-3
and ErbB-4 (Plowman, G.D., Green, J.M., Culouscou, J.M., Carlton, G.W.,
Rothwell, V.M., and Buckley, S. (1993) Nature 366, 473-475). One of the
heregulins is known as neuregulin or NDF and contains an EGF-like
sequence that was found to fold into an EGF-like fold by NMR (Nagata, K.,
Kohda, D., Hatanaka, H., Ichikawa, S., Matsuda, S., Yamamoto, T., Suzuki,
A., and Inagaki, F. (1994) EMBO J. 13, 3517-3523 and Jacobson, N.E., Abadi,
N., Sliwkowski, M.X., Reilly, D., Skelton, N.J., and Fairbrother, W.J. (1996)
Biochemistry 36, 3402-3417).

The type II family of receptor tyrosine kinases consists of the insulin
receptor (INSR), the insulin-like growth factor I receptor, and the insulin
receptor-related receptor (Ullrich, A., and Schlessinger, J. (1990) Cell 61,
203-212). Although the type II receptors consist of four chains (α2β2), both the
extracellular portions of the receptors from the two families, as well as the
tyrosine kinase portions, share significant sequence homology, suggesting a
common evolutionary origin (Ullrich, A., and Schlessinger, J. (1990) Cell 61,
203-212, and Bajaj, M., Waterfield, M.D., Schlessinger, J., Taylor, W.R., and
Blundell, T. (1987) Biochim. Biophys. Acta 916, 220-226).

The 621 amino acid residues of the extracellular domain of the human
EGFR (sEGFR) can be subdivided into four domains as follows: L1, S1, L2
and S2, where L and S stand for "large" and "small" domains, respectively
(Bajaj, M., Waterfield, M.D., Schlessinger, J., Taylor, W.R., and Blundell, T.
(1987) Biochim. Biophys. Acta 916, 220-226, see Fig. 2). The L1 and L2
domains are homologous, as are the S1 and S2 domains.

Ligand-induced dimerization was first reported for the EGF receptor
(Schlessinger, J. (1980) Trends Biochem Sci 13, 443-447) and is now widely
accepted as a general mechanism for the transmission of growth stimulatory
signals across the cell membrane. Although many biochemical experiments
have been performed to reveal the molecular mechanism of receptor
dimerization (Lemmon, M.A., Bu, Z., Ladbury, J.E., Zhou, M., Pinchasi, D.,
Lax, I., Engelman, D.M., and Schlessinger, J. (1997) EMBO J. 16, 281-294;
Tzahar, E., Pinkas-Kramarski, R., Moyer, J.D., Klapper, D.N., Alroy, I.,
Levkowitz, G., Shelly, M., Henis, S., Eisenstein, M., Ratzkin, B.J., Sela, M.,
Andrews, G.C., and Yarden, Y. (1997) EMBO J. 16, 4938-4950; and Lax, I.,
Mitra, A.K., Ravern, C., Hurwitz, D.R., Rubinstein, M., Ullrich, A., Stroud,
R.M., and Schlessinger, J. (1991) J. Biol. Chem. 266, 13828-13833), the
molecular mechanism by which monomeric ligands induce dimerization is
still unknown for members of the EGFR family. Single particle averaging of
electron microscopic images suggests that the overall shape of the sEGFR is
four-lobed and doughnut-like (Lax, I., Mitra, A.K., Ravern, C., Hurwitz, D.R.,
Rubinstein, M., Ullrich, A., Stroud, R.M., and Schlessinger, J. (1991) J. Biol.
Chem. 266, 13828-13833). Small angle x-ray scattering also indicates that the
sEGFR is a flattened sphere with long diameters of 110 Å and a short
diameter of 20 Å (Lemmon, M.A., Bu, Z., Ladbury, J.E., Zhou, M., Pinchasi,
D., Lax, I., Engelman, D.M., and Schlessinger, J. (1997) EMBO J. 16, 281-294).
The crystallization of sEGFR in complex with EGF has been published
(Günther, N., Betzel, C., and Weber, W. (1990) J. Biol. Chem. 265,
22082-22085), but the structure has not yet been reported, despite a decade
of effort by many groups.

The EGF receptor ligand TGF-α has been observed to be overproduced
in keratinocyte cells which are subject to psoriasis (Turbitt, M.L. et al., 1990,
J. Invest. Dermatol. 95(2), 229-232; Higashiyama, M. et al., 1991, J.
Dermatol. 18(2), 117-119; Elder, J.T. et al., 1990, 94(1), 19-25). The
overproduction of at least one other EGF receptor ligand, amphiregulin, has
also been implicated in psoriasis (Piepkorn, M., 1996, Am. J. Dermatopath.
18(2), 165-171). Molecules that inhibit the EGF receptor have been shown to
inhibit the proliferation of both normal keratinocytes (Dvir, A. et al., 1991,
J. Cell Biol. 113(4), 857-865) and psoriatic keratinocytes (Ben-Bassat, H. et al.,
1995, Exp. Dermatol. 4(2), 82-88). These findings indicate that EGF receptor
antagonists may be useful in the treatment of psoriasis.

Many cancer cells express constitutively active EGFR (Sandgreen, E.
P., et al., 1990, Cell, 61:1121-135; Karnes, W. E. J., et al., 1992,
Gastroenterology, 102:474-485) or other EGFR family members (Hynes, N.
E., 1993, Semin. Cancer Biol. 4:19-26). Elevated levels of activated EGFR
occur in bladder, breast, lung and brain tumours (Harris, A. L., et al., 1989, In
Furth & Greaves (eds) The Molecular Diagnostics of human cancer. Cold
Spring Harbor Lab. Press, CSH, NY, pp 353-357). Antibodies to EGFR can
inhibit ligand activation of EGFR (Sato, J. D., et al., 1983, Mol. Biol. Med.
1:511-529) and the growth of many epithelial cell lines (Aboud-Pirak E., et
al., 1988, J. Natl Cancer Inst. 85:1327-1331). Patients receiving repeated doses
of a humanised chimeric anti-EGFR monoclonal antibody (Mab) showed
signs of disease stabilization. The large doses required and the cost of
production of humanised Mab are likely to limit the application of this type of
therapy. These findings indicate that EGF receptor antagonists will be
attractive anticancer agents.

Summary of the Invention

The present inventors have now obtained 3-dimensional structural
information concerning the epidermal growth factor receptor (EGFR). This
structural information was obtained by comparative modelling based on the
3D structure of the IGF-1 receptor as described in PP0585 and PP2598 (a copy
of which is annexed hereto as Annexure A). The information presented in
the present application provides the opportunity for the development of
specific antagonists and agonists of EGFR for therapeutic applications.

Accordingly, in a first aspect the present invention provides a method
of screening for, or designing, an agonist of the EGF receptor which method
includes

(i) selecting or designing a substance which possesses
stereochemical complementarity to the EGF receptor site, wherein the
receptor site is characterised by

(a) amino acids 1-474 of the EGF receptor positioned at atomic
coordinates substantially as shown in Figures 6 and 7 or a subset thereof; and

(ii) testing the substance for the ability to act as an agonist of the EGF
receptor.

In a second aspect the present invention provides a method of
screening for, or designing, an antagonist of the EGF receptor which method
includes

(i) selecting or designing a substance which possesses
stereochemical complementarity to the EGF receptor site, wherein the
receptor site is characterised by

(a) amino acids 1-474 of the EGF receptor positioned at atomic
coordinates substantially as shown in Figures 6 and 7 or a subset thereof; and

(ii) testing the substance for the ability to act as an antagonist of the
EGF receptor.

The EGF receptor site defined in the first and second aspects of the
present invention comprises the L1, S1 and L2 domains (residues 1-474) of
the ectodomain of EGFR. At the centre of this structure is a cavity, bounded
by all three domains, of sufficient size to accommodate a ligand molecule.
By "stereochemical complementarity" we mean that the substance or a
portion thereof correlates, in the manner of the classic "lock-and-key"
visualisation of ligand-receptor interaction, with the cavity in the receptor
site. Preferably, the stereochemical complementarity is such that the
substance has a Ki for the receptor site of less than 10"®M. More preferably,
the Ki value is less than lO'^M and more preferably less than lO'^M.
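As a rough illustration of what a Ki threshold implies energetically, the
following sketch converts an inhibition constant into a standard binding
free energy using the relation ΔG = RT ln Ki. The temperature (298.15 K), the
function name and the example Ki values are assumptions chosen for
illustration only and are not taken from this specification.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(ki_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) for a Ki given in mol/L.

    Uses dG = R * T * ln(Ki); a smaller Ki gives a more negative dG.
    """
    if ki_molar <= 0:
        raise ValueError("Ki must be positive")
    return R_KCAL * temperature_k * math.log(ki_molar)


if __name__ == "__main__":
    # Illustrative Ki values only (hypothetical, not the claimed thresholds).
    for ki in (1e-6, 1e-9):
        print(f"Ki = {ki:.0e} M -> dG = {binding_free_energy(ki):.2f} kcal/mol")
```

Each tenfold decrease in Ki corresponds to roughly 1.4 kcal/mol of additional
binding free energy at room temperature.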

In preferred embodiments of the first and second aspects of the present
invention, the method further involves selecting or designing a substance
which has portions that match residues positioned on the surface of the
receptor site which faces the cavity. By "match" we mean that the identified
portions interact with the surface residues, for example, via hydrogen
bonding or by enthalpy-reducing Van der Waals interactions which promote
desolvation of the biologically active substance within the site, in such a way
that retention of the substance within the cavity is favoured energetically.

In a preferred embodiment of the first aspect of the present invention,
the method includes screening for, or designing, a substance which possesses
a stereochemistry and/or geometry which allows it to interact with both the
L1 and L2 domains of the EGF receptor site. It is believed that EGFR
monomers dimerise in nature in such a manner that the cavities of each
monomer may face each other. Accordingly, the method of the first aspect of
the present invention may involve screening for, or designing, a biologically
active substance which interacts with the L1 domain of one monomer and
the L2 domain of the other monomer.

In a third aspect the present invention provides a method of selecting
or designing an agonist of the EGF receptor which method includes

(i) selecting or designing a substance which interacts with

(a) a fragment of the EGF receptor characterised by amino acids
1-474 positioned at atomic coordinates substantially as shown in Figures 6
and 7 or a subset thereof;

wherein the interaction of the substance with the fragment alters the
position of at least one of the L1, L2 or S1 domains of the fragment relative to
the position of at least one of the other domains; and

(ii) testing the substance for the ability to act as an agonist of the EGF
receptor.

In a preferred embodiment of the third aspect of the present invention
the substance interacts with the fragment in the region of the L1 domain-S1
domain interface, causing the L1 and S1 domains to move away from each
other. In a further preferred embodiment the substance interacts with the
hinge region between the L2 domain and the S1 domain causing an alteration
in the positions of the domains relative to each other. In a further preferred
embodiment the substance interacts with the β sheet of the L1 domain
causing an alteration in the position of the L1 domain relative to the position
of the S1 domain or L2 domain.

In a fourth aspect the present invention provides an agonist of the EGF 
receptor obtained by a method according to the first or third aspects of the 
present invention. 

In a fifth aspect the present invention provides an antagonist of the 
EGF receptor obtained by a method according to the second aspect of the 
present invention. 

The agonists or antagonists of the fourth and fifth aspects of the 
present invention may be mutant EGFR ligands where at least one mutation 
occurs in the region of the ligand which interacts with residues on the 
surface of the receptor site facing toward the cavity. For example, the 
residues Arg 41 and Tyr 13 in EGF are conserved in other members of the 
EGF receptor family of ligands (a Phe residue may be substituted for Tyr 13). 
Structures of several EGF family members show the two residues to be in 
close proximity. This portion of EGF may interact with a hydrophobic 
portion of the EGF receptor which contains one or more negatively charged 
residues such as the lower β sheet of the L1 domain. Mutants of EGF which
show altered activity may be generated by introducing modifications to Arg 
41 or Tyr 13 or other nearby residues. Alternatively, mutants of EGF may be 
generated by introducing modifications to residues on the opposite side of 
the ligand which may interact with a second receptor molecule in the 
unmodified ligand. 

In a sixth aspect the present invention provides a substance which 
possesses stereochemical complementarity to the EGF receptor site, wherein 
the receptor site is characterised by 

(a) amino acids 1-474 of the EGF receptor positioned at atomic 
coordinates substantially as shown in Figures 6 and 7 or a subset thereof; 

with the proviso that the substance is not a naturally occurring ligand 
of the EGF receptor or a mutant thereof. 

By "mutant" we mean a ligand which has been modified by one or 
more point mutations, insertions of amino acids or deletions of amino acids. 

In a preferred embodiment of the sixth aspect of the present invention,
the stereochemical complementarity is such that the compound has a Ki for
the receptor site of less than 10'®M. More preferably, the Ki value is less than
lO'^M and more preferably less than lO'^M.

The 3-dimensional structure of the EGF receptor elucidated by the
present inventors also shows that the L2 and S2 domains are positioned such
that they form a "corner" structure. It is envisaged that this corner structure
provides a further binding site for ligands of the EGF receptor.

Accordingly, in a seventh aspect the present invention provides a
method of screening for, or designing, an agonist of the EGF receptor which
method includes

(i) selecting or designing a substance which binds simultaneously to
the L2 and S2 domains of the EGF receptor, wherein the L2 and S2 domains
are positioned substantially according to the atomic coordinates of amino
acids 313-621 as shown in Figure 7, and

(ii) testing the substance for the ability to act as an agonist of the EGF
receptor.

In an eighth aspect the present invention provides a method of
screening for, or designing, an antagonist of the EGF receptor which method
includes

(i) selecting or designing a substance which binds simultaneously to
the L2 and S2 domains of the EGF receptor, wherein the L2 and S2 domains
are positioned substantially according to the atomic coordinates of amino
acids 313-621 as shown in Figure 7, and

(ii) testing the substance for the ability to act as an antagonist of the
EGF receptor.

In preferred embodiments of the seventh and eighth aspects of the
present invention, the method involves selecting or designing a substance
which has portions that match residues positioned on the inner surface of the
corner structure. By "match" we mean that the identified portions interact
with the surface residues, for example, via hydrogen bonding or by
enthalpy-reducing Van der Waals interactions in such a way that retention of
the substance within the corner structure is favoured energetically.

Preferably, the substance matches the residues positioned on the inner
surface such that the substance has a Ki for the corner structure of less than
lO'^M. More preferably, the Ki value is less than lO'^M and more preferably
less than lO'^M.



In a ninth aspect the present invention provides a method of selecting 
or designing an agonist of the EGF receptor which method includes 

(i) selecting or designing a substance which interacts with 

(a) a fragment of the EGF receptor characterised by amino acids 
313-621 positioned at atomic coordinates substantially as shown in Figure 7 
or a subset thereof; 

wherein the interaction of the substance with the fragment alters the 
relative positions of the L2 and S2 domains of the fragment with respect to 
each other; and 

(ii) testing the substance for the ability to act as an agonist of the EGF 
receptor. 

In a tenth aspect the present invention provides an agonist of the EGF 
receptor obtained by a method according to the seventh or ninth aspects of 
the present invention. 

In an eleventh aspect the present invention provides an antagonist of 
the EGF receptor obtained by a method according to the eighth aspect of the 
present invention. 

In a twelfth aspect the present invention provides a pharmaceutical 
composition for preventing or treating a disease which would benefit from 
increased signalling by the EGF receptor, which includes an agonist obtained 
by a method according to the first, third, seventh or ninth aspects of the 
present invention and a pharmaceutically acceptable carrier or diluent. 

In a thirteenth aspect the present invention provides a
pharmaceutical composition for preventing or treating a disease associated 
with signalling by the EGF receptor which includes an antagonist obtained 
by a method according to the second or eighth aspects of the present 
invention and a pharmaceutically acceptable carrier or diluent. 

In a fourteenth aspect the present invention provides a method of 
preventing or treating a disease which would benefit from increased 
signalling by the EGF receptor which method includes administering to a 
subject in need thereof an agonist obtained by a method according to the 
first, third, seventh or ninth aspects of the present invention. 

Diseases which may be treated by administration of EGFR agonists 
include wound healing and gastric ulcers. 

In a fifteenth aspect the present invention provides a method of
preventing or treating a disease associated with signalling by the EGF
receptor which method includes administering to a subject in need thereof an
antagonist obtained by a method according to the second or eighth aspects of
the present invention.

Diseases associated with signalling by the EGF receptor include 
psoriasis and many types of tumour states including but not restricted to 
cancer of the breast, brain, ovary, cervix, pancreas, lung, head and neck, and 
melanoma, rhabdomyosarcoma, mesothelioma and glioblastoma. 

Brief Description of the Drawings 

Figure 1: Sequence alignment of human EGF receptor family proteins with
IGF-1 receptor sequences and insulin receptor sequence for the first two
domains of the EGF receptor. The alignment of the EGF receptor and the various
IGF-1 receptor sequences was used by the MODELLER program to create a
model of the EGF receptor domains L1 and S1. Residues which are
underlined were used to create additional Cα-Cα restraints for the
construction of the EGF receptor model. IGF-1 receptor residues colored in
magenta form part of helical secondary structures. Residues colored in light
blue, light green and dark yellow reside in one of the three β-sheets (colored
light blue, light green and dark yellow respectively) which make up part of
the L1 β-helix. Residues colored in dark blue and dark green form part of a
β-strand in the β-fingers. The residues in red are also in β-strands. Each
cysteine residue in the S1 domain is numbered according to the module
that it is a part of.

Figure 2: Sequence alignment of human EGF receptor family proteins with 
IGF-1 receptor sequences and insulin receptor sequence for the third and 
fourth domains of the EGF receptor. The labelling scheme of the residues is 
the same as for Figure 1. 

Figure 3: Model polypeptide fold of the L1 and S1 domains of the EGF
receptor. The L1 domain is at the left hand side of the structure with the
N-terminus facing the front. The secondary structure elements are coloured in
the same manner as in Figure 1.

Figure 4: Model polypeptide fold of the L2 and S2 domains of the EGF
receptor. The L2 domain is at the bottom with its N-terminus facing the front.
The secondary structure elements are coloured in the same manner as in
Figure 1.

Figure 5: Superposition of the two models (of L1 and S1 domains and of L2
and S2 domains) onto the structure of the first three domains of the IGF-1
receptor. The residues have been colored according to an estimate of the
accuracy of the model coordinates. Residues colored in yellow are judged to be
well-modelled. Residues colored in orange are judged to have a moderate
possibility of error. The coordinates of residues colored in red are believed to
be inaccurate.

Figure 6: Coordinates of the model of the EGF receptor domains L1 and S1.
The coordinates are in relation to a Cartesian set of orthogonal axes. The
final column contains the number 20, 40 or 60 depending on whether the
residue containing the atom is judged to be well modelled, have a moderate
possibility of error or is believed to be inaccurate respectively.

Figure 7: Coordinates of the model of the EGF receptor domains L2 and S2.
The coordinates are in relation to a Cartesian set of orthogonal axes which
are independent of the coordinate frame used for the EGF receptor model for
L1 and S1 domains. The number in the final column is assigned in the same
manner as for Figure 6.

Detailed description of the Invention 

Comparative modelling 

The comparative modelling method exploits the observation that
proteins with more than 25% amino acid identity will almost always have a
similar protein backbone (Sander, C. and Schneider, R., 1991, Proteins:
Structure Function and Genetics, 9, 56-68). In some cases, proteins will have
similar backbone structures with a lower proportion of identical amino acids.
By aligning the sequence of a (target) protein which is to be modelled with
the sequences of proteins with known structures (the templates), a model of
the protein can be obtained. Where a region of the target sequence follows the
sequence of a template, the backbone of the target is built to follow that of
the template. Where the target sequence can not be aligned to a template
sequence, the so-called insertion must be constructed by other means (Greer,
J., 1991, Meth. Enzym. pp 239-252).
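The identity criterion above can be sketched as a short calculation on a pair
of pre-aligned sequences. This is a minimal illustration only: the function
name, the gap character and the example sequences are hypothetical, and the
choice of denominator (columns where both sequences have a residue) is one of
several conventions in use.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percentage of identical residues over columns with a residue in both rows."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # insertion or deletion: no residue pair to compare
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Figures 1 or 2.
    target = "ACDEFGH-IK"
    template = "ACDQFG-WIK"
    identity = percent_identity(target, template)
    print(f"{identity:.1f}% identity; above 25% threshold: {identity > 25.0}")
```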

The MODELLER program (Šali, A. and Blundell, T.L., 1993, J. Mol.
Biol. 234, 779-815) is a semi-automated approach to building models of
proteins given the structures of one or more template structures and an
alignment between the sequences of the target protein and the templates.
Based on the sequence alignment and a set of rules derived from the analysis
of sets of aligned structures, the program generates a series of restraints for
variables such as Cα-Cα distances, main chain and side chain dihedral angles
for the target structure. The restraints are expressed in terms of probability
density functions (PDFs). The PDFs are combined to yield an expression for
the most probable structure as a function of the variables (Cα-Cα distances
etc). The program then attempts to find structures to maximise the value of
this function. In effect, the program attempts to minimise a transformed
version of this function.
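The relationship between maximising a combined PDF and minimising a
transformed version of it can be sketched as follows. The sketch assumes
independent Gaussian restraints and uses the negative logarithm as the
transformation; the function names and numerical values are illustrative
assumptions and do not reproduce the MODELLER implementation.

```python
import math


def gaussian_pdf(x, mean, sigma):
    """Value of a Gaussian probability density at x."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def combined_pdf(values, restraints):
    """Product of the individual restraint PDFs (restraints assumed independent)."""
    p = 1.0
    for x, (mean, sigma) in zip(values, restraints):
        p *= gaussian_pdf(x, mean, sigma)
    return p


def objective(values, restraints):
    """Negative log of the combined PDF; minimising it maximises the PDF."""
    return -sum(math.log(gaussian_pdf(x, mean, sigma))
                for x, (mean, sigma) in zip(values, restraints))


if __name__ == "__main__":
    # Two hypothetical distance restraints as (mean, sigma) pairs in Å.
    restraints = [(5.0, 1.0), (10.0, 2.0)]
    for label, values in (("satisfied", [5.0, 10.0]), ("violated", [7.0, 6.0])):
        print(label, combined_pdf(values, restraints), objective(values, restraints))
```

Working with the sum of logarithms rather than the product avoids numerical
underflow when many restraints are combined.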

While some comparative modelling approaches involve the explicit 
building of regions of the model for which there is no sequence alignment 
with a template, the MODELLER program constructs PDFs for these regions, 
thus including them in the consideration of constructing a comparative 
model. It is conceivable that, once a comparative model has been constructed
using MODELLER, an algorithm to build the structures of these regions could
be applied.

The MODELLER program was used to build the structures of the
extracellular portion of the EGF receptor using the 3D structure of the IGF-1
receptor (as described in PP0585 and PP2598) as a template. The description
of the generation of these models is outlined below.

Construction of the alignment

The sequence of the EGF receptor extracellular domain can be divided
into four domains, L1, S1, L2 and S2 on the basis of internal homology and
homology with the insulin receptor family (Ward, C.W. et al., 1995, Proteins:
Structure Function and Genetics 22: 141-153; Bajaj, M. et al., 1987, Biochim.
Biophys. Acta 916: 220-226). At least two important sequence motifs are
found in the EGF receptor sequence which are conserved in other EGF
receptor homologues. The first motif is the sequence CXXXXXXW which is
found towards the end of both L1 and L2 of EGFR (C is cysteine, W is
tryptophan and X is any residue). The second motif is the sequence CW
where C is the third cysteine of both S1 and S2 (using the assignment of
domain boundaries from Ward, C.W. et al., 1995, Proteins: Structure
Function and Genetics 22: 141-153). The first motif is found in L1 but not L2
of the insulin receptor family. The second motif is found in the cysteine-rich
domain of the insulin receptor family. These motifs are found in L1 and the
cysteine-rich domain of the insulin receptor family. Structurally, the first
motif corresponds to part of the L1 domain which allows penetration of the
tryptophan residue of the second motif into the β-helix. As the first sequence
motif is absent from L2 of the IGF-1 receptor, only the L1 and cysteine rich
domains of the IGF-1 receptor were used as templates for the building of the
EGF receptor extracellular domain models.
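The two sequence motifs described above can be located with simple pattern
matching, as in the following sketch. The example sequence is hypothetical
and is not an EGF receptor sequence; the function and constant names are
likewise illustrative assumptions.

```python
import re

# CXXXXXXW: cysteine, any six residues, tryptophan.
# The lookahead allows overlapping occurrences to be reported.
MOTIF_1 = re.compile(r"(?=(C.{6}W))")
# CW: cysteine immediately followed by tryptophan.
MOTIF_2 = re.compile(r"CW")


def find_motif(pattern, sequence):
    """Return the 1-based start position of every match of pattern in sequence."""
    return [m.start() + 1 for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Hypothetical one-letter-code fragment used only to exercise the patterns.
    seq = "AAACGGGGGGWAAACWAA"
    print("CXXXXXXW at:", find_motif(MOTIF_1, seq))
    print("CW at:", find_motif(MOTIF_2, seq))
```

Note that a pattern match only locates candidate positions; deciding whether
a CW match involves the third cysteine of a domain requires the domain
boundaries.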
Construction of the alignment of L1 and S1

There are two loops in the structure of the L1 domain which emerge
from the breadloaf structure. The second loop (residues 86-93 in EGFR L1,
79-85 in IGF-1R L1) is structurally conserved in the L2 domain and differs by
one amino acid residue in length. A region of the L2 domain corresponding
to the loop was used as an additional template for this region. The sequence
of the EGF receptor which corresponds to the first loop is of a different length
and does not seem to be consistent with the loop of the IGF-1 receptor. The
latter half of the region of EGF receptor sequence can be aligned to a region
of sequence in the IGF-1 receptor's L2 domain. A portion of the IGF-1
receptor structure corresponding to this region of sequence plus the structure
of flanking sequences was used as an additional template.

The alignment of the S1 domain of the EGF receptor to the IGF-1
receptor used the same combination of modules but involved the use of other
modules from the cysteine-rich domain as additional templates. The first
and second modules of the EGF receptor used the third module of the IGF-1
receptor cysteine-rich domain as additional templates. (This module contains
two cysteines in disulfide bonds in a 1-3, 2-4 arrangement.) The sixth
module of the EGF receptor can be modelled by the fifth module of the IGF-1
receptor, a β-finger.

Construction of the alignment of L2 and S2

The alignment of the EGF receptor sequence for the L2 domain to the
L1 domain of the IGF-1 receptor sequence was similar to that of the L1
alignment. There is a 16 amino acid region which occurs roughly in the same
region as the first loop in the IGF-1 receptor L1 domain. This region of
sequence, which exhibits sequence homology amongst the EGF receptor
family of proteins, can not be aligned with any region of the IGF-1 receptor
sequence.

The sequence of the S2 domain was found to differ significantly from
that of the S1 domain, suggesting that the pattern of disulfide bonds may be
different.

An analysis of the β-finger structures in the IGF-1 receptor, TNF
receptor and laminin structures revealed that the β-fingers could be classed
into three types exhibiting some structural and sequence conservation. Two
of the structural types are relevant to the IGF-1 and EGF receptors. The first
type of β-finger is characterised by structural conservation of the C-terminal
portion of the module and also of the linker region after the module. The
sequence signature is C...CXXC where the third cysteine residue is the start
of another β-finger module. The second type of β-finger is characterised by
structural conservation of the N-terminal portion of the module and also of
the linker region after the module. The sequence signature is C...CXXXC
where the third cysteine is the start of a module whose disulfide bonding
pattern is 1-3, 2-4. The fifth module of the IGF-1 receptor cysteine-rich
domain has some structural conservation with both types of β-finger.
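The two sequence signatures can be distinguished by the spacing between the
second and third cysteines, as in the following sketch. It assumes that the
"..." in each signature is a run of one or more non-cysteine residues and that
the X positions are also non-cysteine; these readings, the function name and
the example segments are assumptions made for illustration.

```python
import re

# Assumed reading: "..." is one or more non-cysteine residues and
# each X is a single non-cysteine residue.
TYPE_1 = re.compile(r"^C[^C]+C[^C]{2}C")  # signature C...CXXC
TYPE_2 = re.compile(r"^C[^C]+C[^C]{3}C")  # signature C...CXXXC


def classify_finger(segment):
    """Classify a segment that starts at the first cysteine of a module."""
    if TYPE_1.match(segment):
        return "type 1"
    if TYPE_2.match(segment):
        return "type 2"
    return None


if __name__ == "__main__":
    # Hypothetical segments, not taken from the receptor sequences.
    for seg in ("CAKDGCPSC", "CAKDGCPSGC", "CAKDGCPSGGC"):
        print(seg, "->", classify_finger(seg))
```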

The regions of the IGF-1 receptor structure which were used as templates
were identified as follows. The structure of IGF-1 receptor from the start of
the L1 domain to the end of the first module of the cysteine-rich domain
(which contains the conserved tryptophan residue which intercalates into the
L1 β-helix) was used to model the corresponding regions of L2 and the start
of S2 of the EGF receptor. Additional templates were used and "joined" to
other templates by virtue of overlap in the sequence alignment.

The fourth and fifth modules of the IGF-1 receptor cysteine-rich
domain were found to align with the sequences of the first and second and
also the fourth and fifth putative modules of the S2 domain. The seventh
module is the second last module of the S2 domain. The eighth module is
neither a β-finger nor a module with the 1-3, 2-4 pattern of disulfide bonds.
By elimination and use of the information described in the preceding
paragraph, the third and sixth modules were assigned to be β-fingers of the
second type. Two parts of the IGF-1 receptor structure were used to model
these two β-fingers. The fifth and seventh modules were used to model the
β-finger modules. The linker region after the seventh module was also used.
Additional residues after the linker were included to guide the placement of
the next module. The positioning of the next module (modules 4 and 7 in S2)
is essentially arbitrary and the use of the extra residues offers a way of
obtaining a plausible placement of the module.

Construction of the model

Version 3 of the MODELLER program (Modeler User Guide, October
1996, San Diego Molecular Simulations Inc) was used to build models of the
EGF receptor. Models of the L1 and S1 domains were constructed from the
alignment shown in Figure 1 using the IGF-1 receptor templates shown and
the EGF receptor sequence. Additional distance restraints were generated
between Cα atoms of selected residues. The restraints were generated as
follows. The small IGF-1 receptor templates were superimposed onto the
structure of the first two domains of the IGF-1 receptor using the Cα atoms of
the residues which are aligned in Figure 1. Using the Homology module of
the Insight II program (Homology User Guide, October 1995, San Diego
BIOSYM/MSI), coordinates were built for the EGF receptor residues which are
aligned to the IGF-1 receptor coordinates which are in bold typeface. From
these coordinates, distance restraints in the form of Gaussian curves were
constructed for pairs of Cα atoms with a distance less than 50 Å. The sigma
value of the Gaussian curves was set to 2 Å. A MODELLER run was
submitted using the alignment in Figure 1. The models built
attempt to satisfy these restraints in addition to the restraints the program
derives from the alignment.
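The restraint construction described above can be sketched as follows. This is a minimal illustration only and is not the MODELLER implementation: the function names, the tuple layout of a restraint and the toy coordinates are assumptions made for the example. Only the 50 Å cutoff, the Gaussian form and the 2 Å sigma are taken from the text.

```python
import math
from itertools import combinations

def build_gaussian_restraints(ca_coords, cutoff=50.0, sigma=2.0):
    """Return (i, j, mean_distance, sigma) for every Cα pair closer than cutoff (Å)."""
    restraints = []
    for (i, a), (j, b) in combinations(enumerate(ca_coords), 2):
        d = math.dist(a, b)
        if d < cutoff:
            restraints.append((i, j, d, sigma))
    return restraints

def restraint_penalty(ca_coords, restraints):
    """Sum of 0.5*((d - mean)/sigma)^2 terms: the negative log of each
    Gaussian restraint, up to an additive constant."""
    total = 0.0
    for i, j, mean, sigma in restraints:
        d = math.dist(ca_coords[i], ca_coords[j])
        total += 0.5 * ((d - mean) / sigma) ** 2
    return total

# Hypothetical Cα coordinates (Å), for illustration only.
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (60.0, 0.0, 0.0)]
restraints = build_gaussian_restraints(template)
```

A model whose Cα-Cα distances match the template distances scores zero under this penalty; the score grows quadratically as a restrained distance departs from its template value, in units of sigma.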

To build models of the L2 and S2 domains, a similar process to that
described in the preceding paragraph was used. The alignment used to build
the models is shown in Figure 2. Two separate sets of additional restraints
were used. The first set of restraints was derived from the IGF-1 receptor
templates which are aligned with the first, second and third modules of the
EGF receptor S2 domain. The second set of restraints was derived from the
IGF-1 receptor templates which were aligned with the fourth, fifth and sixth
modules of the EGF receptor S2 domain. Only those residues which are
underlined in Figure 2 were used to generate the restraints. The sigma value
of the Gaussian curves used to construct the additional restraints was 1 Å.

For both sets of models, the MODELLER program constructed 20
models whose coordinates were perturbed from an initial structure by a
random value of maximum distance 4 Å. The refinement level used was the
'refine1' option in the MODELLER program.
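The generation of perturbed starting structures can be illustrated with a short sketch. It is not MODELLER's own randomisation routine: the function names and toy coordinates are hypothetical, and the choice to shift each Cartesian coordinate independently by a uniform value in the range -4 Å to +4 Å is an assumption about how the "maximum distance 4 Å" is applied.

```python
import random

def perturb(coords, max_shift=4.0, rng=None):
    """Shift every x, y, z value by a uniform random amount in [-max_shift, +max_shift] Å.
    (Assumption: per-coordinate uniform shifts; the source does not specify the scheme.)"""
    rng = rng or random.Random()
    return [tuple(x + rng.uniform(-max_shift, max_shift) for x in atom)
            for atom in coords]

def starting_models(initial, n_models=20, max_shift=4.0, seed=0):
    """Build n_models independently perturbed copies of the initial structure."""
    rng = random.Random(seed)
    return [perturb(initial, max_shift, rng) for _ in range(n_models)]

# Hypothetical initial coordinates (Å), for illustration only.
initial = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
models = starting_models(initial)
```

Each perturbed copy would then be optimised separately against the restraints, so that the 20 resulting models sample different routes to a restraint-satisfying structure.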
Structure of the EGF receptor model 

The structure of the L1 and S1 domains of the EGF receptor as
determined by the modelling described above is shown in Figure 3, while the
structure of the L2 and S2 domains is shown in Figure 4. The superposition
of these two models onto the structure of the extracellular domains of the
IGF-1 receptor is shown in Figure 5.

The coordinates of the EGF receptor domains L1 and S1 are shown in
Figure 6. The coordinates of the EGF receptor domains L2 and S2 are shown
in Figure 7. 

The structures of the L1 and S1 domains are similar to those of the
IGF-1 receptor structure, as expected. There are two major differences in the
S1 domain from the structure of the cysteine-rich region of the IGF-1 receptor
structure. The sixth module of S2 is smaller than that of the IGF-1 receptor and
occupies less of the region between the two L domains. The fifth module,
another β-finger, contains a large insertion which points away from the L1
domain. The structure of the end of the EGF receptor S1 domain is similar to
that of the IGF-1 receptor cysteine-rich domain and is postulated to contain a
hinge region between the last module of the S1 domain and the L2 domain.

A region of the EGF receptor in L2 which could not be aligned with the
IGF-1 receptor sequence includes the amino acids Trp-Pro, which are
conserved in the EGF receptor family. This sequence motif is not
found in the insulin receptor family and may represent a region of novel
structure. This region of sequence could not be modelled on the
corresponding region of the IGF-1 structure since none of the amino acids of
the sequence Glu-Asn-Arg could be placed such that their side chains are in
the interior of the β-helix. The asparagine has been observed to be
glycosylated (Smith, K.D. et al, 1996, Growth Factors, 13(1-2), 121-132) and
therefore must point out of the structure. The charged residues glutamate and
arginine are also expected to point out from the β-helix.

The amino acids 352-367 correspond to a large insertion in the third
domain of the EGF receptor. The amino acids 351-364 have been identified
as the epitope for several antibodies against the EGF receptor (Wu, D.G. et al.,
J. Biol. Chem. 1989 264(29):17469-17475). That this region forms a loop
which sticks out of the structure is consistent with this region being
accessible to the antibodies. The structure itself is difficult to model
accurately since its sequence does not correspond to any part of the IGF-1
receptor sequence. The position of this insertion is in approximately the
same region where the structures of the IGF-1 receptor L1 and L2 domains differ.

The S2 domain adopts a different shape to the S1 domain. The S2
domain adopts a rod-like shape similar to that of the laminin γ-chain
(Stetefeld, J. et al., 1996, J. Mol. Biol., 257(3): 644-657). Like the first half of
the receptor model, the S2 domain contacts the L2 domain with the first
module (this module contains the conserved tryptophan which intercalates
into the breadloaf). Unlike S1, the rest of the S2 domain does not make any
more contact with the L2 domain. The S2 domain points out from the L2
domain with a different geometry to the manner in which the S1 domain
points out from L1.

Putative binding sites of the EGF receptor

From the IGF-1 receptor structure and a number of insulin receptor
mutants, one of the regions of insulin binding was proposed to be the lower β
sheet of the L1 domain. This surface is characterised by a number of
hydrophobic residues which point out of the structure and also the presence
of a structurally conserved loop. By analogy, we propose that the analogous
β sheets of the L1 and L2 domains are potential binding sites. These sheets
contain a number of hydrophobic residues, conserved amongst EGF receptor
family members, which point away from the core of the β-helix structure.
Residue 45 of a mutant EGF has been cross-linked to the residue Lysine 465,
which is in the last strand of the lower β sheet of the L2 domain (Summerfield,
AE et al, J Biol Chem, 1996, 271(33), 19656-19659). Tyrosine 101 has been
cross-linked to the N-terminus of EGF (Woltjer, RL et al, PNAS, 1992, 89(16),
7801-7805). This residue is in the portion of sequence which immediately
follows a strand in the lower β sheet of L1.

The side chain of asparagine 1 of EGF has been cross-linked to lysine
336 of the EGF receptor (Wu, DG et al, PNAS, 1990, 87(8), 3151-3155). The
latter residue is in the N-terminal helix of the L2 domain and points towards
the cavity which is formed when the two halves of the EGF receptor are
positioned in a similar arrangement to the first three domains of the IGF-1
receptor. Two nearby residues, Asn 328 and Asn 337, are glycosylated. This
mutation is in a similar position to the insulin receptor mutant S323L which
has aberrant insulin binding.

Several insertional mutants of the EGF receptor extracellular domain
have been constructed to probe the role of several regions of the receptor
(Harte, M.T. and Gentry, L.E., 1995, Arch. Biochem. Biophys. 322(2),
387-389). EGF receptor mutants with insertions at residues 162, 169, 174 and
220 bound EGF with a similar affinity to wild-type EGF receptor but bound
TGF-α with a lower affinity than wild-type receptor. The first insertion was
located in the region near the end of the L1 domain and the first cysteine of
the first module in S1. The second and third insertions were present in the
first module of S1 and the fourth insertion was present in the third module of
S1. EGF receptor mutants with insertions at positions 251 and 574 (both in
large β-finger modules, the first in S1 and the second in S2) bound twice as
much EGF as the wild-type receptor. Two insertional mutants which showed
reduced EGF receptor binding contained insertions at positions 291 and 474.
The former insertion is contained in the seventh module of S1, which is a
β-finger. The latter insertion is near the end of the L2 domain.

Another EGF receptor mutant which shows altered ligand binding
behaviour is the R497K mutant. The site of this mutation is in the first module
of the S2 domain and faces the side of the L2 domain opposite to that
containing residue 465. This mutant binds EGF in a similar fashion to the
wild-type receptor but abolishes the high affinity binding site for TGF-α
(Moriai, T. et al, 1994, PNAS 91(21), 10217-10221).

It will be appreciated by persons skilled in the art that numerous 
variations and/or modifications may be made to the invention as shown in 
the specific embodiments without departing from the spirit or scope of the 
invention as broadly described. The present embodiments are, therefore, to 
be considered in all respects as illustrative and not restrictive. 

Dated this twenty ninth day of May 1998 

BIOMOLECULAR RESEARCH 
INSTITUTE LTD
Figure 7
PROVISIONAL SPECIFICATION



Invention Title: 



EGF family receptor agonists and antagonists 



The invention is described in the following statement: 



EGF FAMILY RECEPTOR AGONISTS AND ANTAGONISTS 

Field of the Invention 

This invention relates to the field of receptor structure and
receptor/ligand interactions. In particular it relates to the field of using
receptor structure to predict the structure of related receptors and to use the
determined structures and predicted structures to select and screen for
agonists and antagonists of the polypeptide ligands.
Background of the Invention 

Insulin is the peptide hormone that regulates glucose uptake and
metabolism. The two types of diabetes are associated with either an inability
to produce insulin because of destruction of the pancreatic islet cells
(Homo-Delarche, F. & Boitard, C., 1996, Immunol. Today 10: 456-460) or poor
glucose metabolism resulting from either insulin resistance at the target
tissues, inadequate insulin secretion by the islets or faulty liver function
(Taylor, S. I., et al., 1994, Diabetes, 43: 735-740).

Insulin-like growth factors-1 and 2 (IGF-1 and 2) are structurally
related to insulin but are more important in tissue growth and development
than in metabolism. They are primarily produced in the liver in response to
growth hormone but are also produced in most other tissues where they
function as paracrine/autocrine regulators. The IGFs are strong mitogens and
are involved in numerous physiological states and certain cancers (Baserga,
R., 1996, TibTech 14: 150-152).

Epidermal growth factor (EGF) is a small polypeptide cytokine that is
unrelated to the insulin/IGF family. It stimulates marked proliferation of
epithelial tissues and is a member of a larger family of structurally related
cytokines such as transforming growth factor α, amphiregulin, betacellulin,
heparin-binding EGF and some viral gene products. Abnormal EGF family
signalling is a characteristic of certain cancers (Soler, C. & Carpenter, G.,
1994 In Nicola, N. (ed) "Guidebook to Cytokines and Their Receptors", Oxford
Univ. Press, Oxford, pp194-197; Walker, F. & Burgess, A. W., 1994, In Nicola,
N. (ed) "Guidebook to Cytokines and Their Receptors", Oxford Univ. Press,
Oxford, pp198-201).

Each of these growth factors mediates its biological actions through
binding to the corresponding receptor. The IR, IGF-1R and insulin
receptor-related receptor (IRR), for which the ligand is not known, are closely
related to each other and are referred to as the insulin receptor subfamily.
There is a large body of information now available concerning the primary
structure of these insulin receptor subfamily members (Ebina, Y., et al., 1985
Cell 40: 747-758; Ullrich, A., et al., 1985, Nature 313: 756-761; Ullrich, A. et
al., 1986, EMBO J 5: 2503-2512; Shier, P. & Watt, V. M., 1989, J. Biol. Chem.
264: 14605-14608) and the identification of some of their functional domains
(for reviews see De Meyts, P. 1994, Diabetologia 37: 135-148; Lee, J. & Pilch,
P. F. 1994 Amer. J. Physiol. 266: C319-C334; Schaffer, L. 1994, Eur. J. Biochem.
221: 1127-1132). IGF-1R, IR and IRR are members of the tyrosine kinase
receptor superfamily and are closely related to the epidermal growth factor
receptor (EGFR) subfamily, with which they share significant sequence
identity in the extracellular region as well as in the cytoplasmic kinase
domains (Ullrich, A. et al., 1984 Nature 309: 418-425; Ward, C. W. et al., 1995
Proteins: Structure Function & Genetics 22: 141-153). Both the insulin and
EGF receptor subfamilies have a similar arrangement of two homologous
domains (L1 and L2) separated by a cys-rich region of approximately 160
amino acids containing 22-24 cys residues (Bajaj, M., et al., 1987 Biochim.
Biophys. Acta 916: 220-226; Ward, C. W. et al., 1995 Proteins: Structure
Function & Genetics 22: 141-153). The C-terminal portion of the IGF-1R
ectodomain (residues 463 to 906) is comprised of four domains: a connecting
domain, two fibronectin type 3 (Fn3) repeats, and an insert domain (O'Bryan,
J. P., et al., 1991 Mol Cell Biol 11: 5016-5031); the C-terminal portion of the
EGFR ectodomain (residues 477-621) consists solely of a second cys-rich
region containing 20 cys residues (Ullrich, A. et al., 1984, Nature 309:
418-425).

Little is known about the secondary, tertiary and quaternary structure
of the ectodomains of these receptor subfamilies. Unlike the members of the
EGFR subfamily, which are transmembrane monomers which dimerise on
binding ligand, the IR subfamily members are homodimers, held together by
disulphide bonds. The extracellular region of the IR/IGF-1R/IRR monomers
contains an α-chain (~703 to 735 amino acid residues) and 192-196 residues
of the β-chain. There is a ~23 residue transmembrane segment, followed by
the cytoplasmic portion (354 to 408 amino acids) which contains the
catalytic tyrosine kinase domain flanked by juxtamembrane and C-tail
regulatory regions and is responsible for mediating all receptor-specific
functions (White, M. F. & Kahn, C. R. 1994 J. Biol. Chem. 269: 1-4). Chemical
analyses of the receptor suggest that the α-chains are linked to the β-chains
via a single disulphide bond with the IR dimer being formed by at least two
α-α disulphide linkages (Finn, F. M., et al., 1990, Proc. Natl. Acad. Sci. 87:
419-423; Chiacchia, K. B., 1991, Biochem. Biophys. Res. Commun. 176,
1178-1182; Schaffer, L. & Ljungqvist, L., 1992, Biochem. Biophys. Res. Comm.
189: 650-653; Sparrow, L. G., et al., 1997, J. Biol. Chem. 47: 29460-29467).

Although the 3D structures of the ligands EGF, TGF-alpha (Hommel,
U., et al., 1992, J. Mol. Biol. 227:271-282), insulin (Dodson, E. J., et al., 1983,
Biopolymers 22:281-291), IGF-1 (Sato, A., et al., 1993, Int J Peptide Protein
Res 41:433-440) and IGF-2 (Torres, A. M., et al., 1995, J. Mol. Biol.
248:385-401) are known and numerous analytical and functional studies of
ligand binding to EGFR (Soler, C. & Carpenter, G., 1994 In Nicola (ed)
"Guidebook to Cytokines and Their Receptors", Oxford Univ. Press, Oxford,
pp194-197), IGF-1R and IR (see De Meyts, P., 1994 Diabetologia, 37:135-148)
have been carried out, the mechanisms of ligand binding and subsequent
transmembrane signalling have not been resolved.

Ligand-induced, receptor-mediated phosphorylation is the signalling
mechanism by which most cytokines, polypeptide hormones and
membrane-anchored ligands exert their biological effects. The primary kinase
may be part of the intracellular portion of the transmembrane receptor protein
as in the tyrosine kinase receptors (for review see Yarden, Y., et al., 1988, Ann.
Rev. Biochem. 57:443-478) or the Ser/Thr kinase receptors (Alevizopoulos, A.
& Mermod, N., 1997, BioEssays, 19:581-591) or be non-covalently associated
with the cytoplasmic tail of the transmembrane protein(s) making up the
receptor complex as in the case of the haemopoietic growth factor receptors
(Stahl, N., et al., 1995, Science 267:1349-1353). The end result is the same:
ligand binding leads to receptor dimerization or oligomerization or a
conformational change in pre-existing receptor dimers or oligomers, resulting
in activation, by transphosphorylation, of the covalently attached or
non-covalently associated protein kinase domains (Hunter, T., 1995, Cell,
80:225-236).

Many oncogenes have been shown to be homologous to growth
factors, growth factor receptors or molecules in the signal transduction
pathways (Baserga, R., 1994 Cell, 79:927-930; Hunter, T., 1997 Cell,
88:333-346). One of the best examples is v-Erb (related to the EGFR). Since
overexpression of a number of growth factor receptors results in
ligand-dependent transformation, an alternate strategy for oncogenes is to
regulate the expression of growth factor receptors or their ligands or to
directly bind to the receptors to stimulate the same effect (Baserga, R., 1994
Cell, 79:927-930). Examples are v-Src, which activates IGF-1R intracellularly;
c-Myb, which transforms cells by enhancing the expression of IGF-1R; and
SV40 T antigen, which interacts with the IGF-1R and enhances the secretion of
IGF-1 (see Baserga, R., 1994 Cell, 79:927-930 for review). Cells in which the
IGF-1 receptor has been knocked out cannot be transformed by SV40 T
antigen. If oncogenes activate growth factors and their receptors then tumour
suppressor genes should have the opposite effect. One good example of this
is WT1, the Wilms' tumour suppressor gene, which suppresses the expression
of IGF-1R (Drummond, J. A., et al., 1992, Science, 257:275-277). Cells that are
driven to proliferate by oncogenes undergo massive apoptosis when growth
factor receptors are ablated since, unlike normal cells, they appear unable to
withdraw from the cell cycle and enter into the G0 phase (Baserga, R., 1994
Cell, 79:927-930).

The insulin-like growth factor-1 receptor (IGF-1R) is one of several
growth-factor receptors that regulate the proliferation of mammalian cells.
However, its ubiquitousness and certain unique aspects of its function make
IGF-1R an ideal target for therapeutic interventions against abnormal growth,
with very little effect on normal cells (see Baserga, R., 1996 TIBTECH,
14:150-152). The receptor is activated by IGF-1, IGF-2 and insulin and plays a
major role in cellular proliferation in at least three ways: it is essential for
optimal growth of cells in vitro and in vivo; several cell types require IGF-1R
to maintain the transformed state; and activated IGF-1R has a protective effect
against apoptotic cell death (Baserga, R., 1996 TIBTECH, 14:150-152). These
properties alone make it an ideal target for therapeutic interventions.
Transgenic experiments have shown that IGF-1R is not an absolute
requirement for cell growth but is essential for the establishment of the
transformed state (Baserga, R., 1994 Cell, 79: 927-930). In several cases
(human glioblastoma; human melanoma; human breast carcinoma; human
lung carcinoma; human ovarian carcinoma; human rhabdomyosarcoma;
mouse melanoma; mouse leukaemia; rat glioblastoma; rat
rhabdomyosarcoma; hamster mesothelioma) the transformed phenotype can
be reversed by decreasing the expression of IGF-1R using antisense to IGF-1R
(Baserga, R., 1996 TIBTECH 14:150-152); or interfering with its function by
antibodies to IGF-1R (human breast carcinoma; human rhabdomyosarcoma)
or by dominant negatives of IGF-1R (rat glioblastoma; Baserga, R., 1996
TIBTECH 14:150-152).

Three effects are observed when the function of IGF-1R is impaired:
tumour cells undergo massive apoptosis, which results in inhibition of
tumorigenesis; surviving tumour cells are eliminated by a specific immune
response; and such a host response can cause a regression of an established
wild-type tumour (Resnicoff, M., et al, 1995, Cancer Res. 54:2218-2222).
These effects, plus the fact that interference with IGF-1R function has a limited
effect on normal cells (partial inhibition of growth without apoptosis), make
IGF-1R a unique target for therapeutic interventions (Baserga, R., 1996
TIBTECH 14:150-152). In addition, IGF-1R is downstream of many other
growth factor receptors, which makes it an even more generalised target. The
implication of these findings is that if the number of IGF-1 receptors on cells
can be decreased or their function antagonised, then tumours cease to grow
and can be removed immunologically. These studies establish that IGF-1R
antagonists will be extremely important therapeutically.

Many cancer cells have constitutively active EGFR (Sandgreen, E. P.,
et al., 1990, Cell, 61:1121-135; Karnes, W. E. J., et al., 1992, Gastroenterology,
102:474-485) or other EGFR family members (Hines, N. E., 1993, Semin.
Cancer Biol. 4:19-26). Elevated levels of activated EGFR occur in bladder,
breast, lung and brain tumours (Harris, A. L., et al., 1989, In Furth & Greaves
(eds) The Molecular Diagnostics of Human Cancer, Cold Spring Harbor Lab.
Press, CSH, NY, pp353-357). Antibodies to EGFR can inhibit ligand activation
of EGFR (Sato, J. D., et al., 1983 Mol. Biol. Med. 1:511-529) and the growth of
many epithelial cell lines (Aboud-Pirak E., et al., 1988, J. Natl Cancer Inst.
85:1327-1331). Patients receiving repeated doses of a humanised chimeric
anti-EGFR antibody showed signs of disease stabilization. The large doses
required and the cost of production of humanised Mab are likely to limit the
application of this type of therapy. These findings indicate that EGF
antagonists would be attractive anticancer agents.
Summary of the Invention 

The present inventors have now obtained 3D structural information
concerning the insulin-like growth factor receptor (IGF-1R) and the insulin
receptor (IR) which provides a rational basis for the development of
antagonists and agonists of the polypeptide ligands for specific therapeutic
applications. This information can be used to predict the structure of related
members of the insulin receptor family and epidermal growth factor family
and to develop agonists and antagonists of their respective polypeptide
ligands.

Accordingly, in a first aspect the present invention provides a method
of screening for, or designing, an agonist of a ligand of an insulin receptor
family member or EGF receptor family member which method includes

(i) selecting or designing a substance which possesses
stereochemical complementarity to a receptor site, wherein the receptor site
is characterised by

(a) amino acids 1-462 of IGF-1R positioned at atomic
coordinates substantially as shown in Figure 1 or a subset thereof; or

(b) amino acids derived from an insulin receptor family 
member or EGF receptor family member which form an equivalent structure 
to the amino acids defined in paragraph (a); and 

(ii) testing the substance for the ability to act as an agonist of the 
ligand of an insulin receptor family member or EGF receptor family member. 

In a second aspect the present invention provides a method of
screening for, or designing, an antagonist of a ligand of an insulin receptor
family member or EGF receptor family member which method includes

(i) selecting or designing a substance which possesses
stereochemical complementarity to a receptor site, wherein the receptor site
is characterised by

(a) amino acids 1-462 of IGF-1R positioned at atomic
coordinates substantially as shown in Figure 1 or a subset thereof; or

(b) amino acids derived from an insulin receptor family 
member or an EGF receptor family member which form an equivalent 
structure to the amino acids defined in paragraph (a); and 

(ii) testing the substance for the ability to act as an antagonist of the 
ligand of an insulin receptor family member or EGF receptor family member. 

The phrase "insulin receptor family" encompasses, for example,
IGF-1R, IR and IRR. The phrase "EGF receptor family" encompasses, for
example, EGFR, ErbB2, ErbB3 and ErbB4. In general, insulin receptor family
members and EGF receptor family members show similar domain
arrangements and share significant sequence identity (preferably at least 20%
identity between the families and at least 40% identity within each family).
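The identity thresholds above can be checked on a pairwise alignment with a calculation along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the sequences are taken to be pre-aligned with "-" marking gaps, and identity is computed over columns in which neither sequence has a gap (other denominators are also in common use and the specification does not say which is intended).

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def meets_threshold(aligned_a, aligned_b, same_family):
    """Apply the preferred thresholds from the text: at least 40% identity
    within a family, at least 20% between the two families."""
    threshold = 40.0 if same_family else 20.0
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned fragments that agree at one gap-free column in five are 20% identical, which would satisfy the between-family threshold but not the within-family one.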
The receptor site defined in the first and second aspects of the
present invention comprises the L1-cysteine rich-L2 region (residues 1-462)
of the ectodomain of IGF-1R. At the centre of this structure is a groove,
bounded by all three domains, of sufficient size to accommodate a ligand
molecule. By "stereochemical complementarity" we mean that the
biologically active substance or a portion thereof correlates, in the manner of
the classic "lock-and-key" visualisation of ligand-receptor interaction, with
the groove in the receptor site. Preferably, the stereochemical
complementarity is such that the compound has a Ki for the receptor site of
less than lO'^M. More preferably, the Ki value is less than 10'°M and more
preferably less than lO'^M.

In preferred embodiments of the first and second aspects of the
present invention, the method further involves selecting or designing a
substance which has portions that match residues positioned on the surface
of the receptor site which faces the groove. By "match" we mean that the
identified portions interact with the surface residues, for example, via
hydrogen bonding or by enthalpy-reducing Van der Waals interactions which
promote desolvation of the biologically active substance within the site, in
such a way that retention of the biologically active substance within the
groove is favoured energetically.

In a preferred embodiment of the first aspect of the present invention,
the method includes screening for, or designing, a substance which possesses
a stereochemistry and/or geometry which allows it to interact with both the
L1 and L2 domains of the receptor site. As described above, the insulin
receptor exists as homodimers held together by disulphide bonds. Electron
microscopy studies described herein indicate that the insulin receptor
monomers dimerise in nature in such a manner that the grooves of each
monomer may face each other. Accordingly, the method of the first aspect of
the present invention may involve screening for, or designing, a biologically
active substance which interacts with the L1 domain of one monomer and
the L2 domain of the other monomer.

In a third aspect the present invention provides a method of selecting
or designing an agonist of a ligand of an insulin receptor family member or
EGF receptor family member which method includes

(i) selecting or designing a substance which interacts with

(a) a fragment of IGF-1R characterised by amino acids 1-462
positioned at atomic coordinates substantially as shown in Figure 1 or a
subset thereof; or

(b) a fragment derived from an insulin family receptor member 
or EGF receptor family member which is equivalent to the fragment defined 
in paragraph (a); 

wherein the interaction of the substance with the fragment alters the
position of at least one of the L1, L2 or cys-rich domains of the fragment
relative to the position of at least one of the other domains; and

(ii) testing the substance for the ability to act as an agonist of the 
ligand of an insulin receptor family member or EGF receptor family member. 

In a preferred embodiment of the third aspect of the present
invention the substance interacts with the fragment in the region of the L1
domain-cys rich domain interface, causing the L1 and cys-rich domains to
move away from each other. In a further preferred embodiment the
substance interacts with the hinge region between the L2 domain and the
cys-rich domain, causing an alteration in the positions of the domains relative
to each other. In a further preferred embodiment the substance interacts
with the beta sheet of the L1 domain, causing an alteration in the position of
the L1 domain relative to the position of the cys-rich domain or L2 domain.

In a fourth aspect the present invention provides an agonist of a 
ligand of an insulin receptor family member or EGF receptor family member 
obtained by a method according to the first or third aspects of the present 
invention. 

In a fifth aspect the present invention provides an antagonist of a
ligand of an insulin receptor family member or EGF receptor family member
obtained by a method according to the second aspect of the present 
invention. 

The agonists or antagonists of the fourth and fifth aspects of the 
present invention may be mutant insulin family member or EGF family 
member ligands where at least one mutation occurs in the region of the 
ligand which interacts with residues on the surface of the receptor site facing 
toward the groove. For example, the IGF-1 ligand has a predominance of 
basic residues in the C region which may interact with the acidic patch of the 
cys-rich region near L1. An acidic patch on the other side of the ligand may
interact with the patch of basic residues (residues 307-310) on the N-terminal
end of L2. Accordingly, mutants of IGF-1 which exhibit altered activity may
be generated by introducing modifications in the C region of IGF-1 or 
residues in the acidic patch on the other side of the hormone. 

In a sixth aspect the present invention provides a substance which
possesses stereochemical complementarity to a receptor site, wherein the
receptor site is characterised by

(a) amino acids 1-462 of IGF-1R positioned at atomic
coordinates substantially as shown in Figure 1 or a subset thereof; or

(b) amino acids derived from an insulin receptor family
member or an EGF receptor family member which form an equivalent
structure to the amino acids defined in paragraph (a);

with the proviso that the substance is not a naturally occurring ligand 
of an insulin receptor family member or EGF receptor family member or a 
mutant thereof. 

By "mutant" we mean a ligand which has been modified by one or
more point mutations, insertions of amino acids or deletions of amino acids.

In a preferred embodiment of the sixth aspect of the present
invention, the stereochemical complementarity is such that the compound
has a Ki for the receptor site of less than lO'^M. More preferably, the Ki value
is less than lO'^M and more preferably less than lO'^M.

In a seventh aspect the present invention provides a pharmaceutical 
composition for treatment of a disease associated with reduced activity of a 
ligand of an insulin receptor family member or EGF receptor family member 
which includes an agonist obtained by a method according to the first or
third aspects of the present invention and a pharmaceutically acceptable
carrier or diluent.

In an eighth aspect the present invention provides a pharmaceutical 
composition for treatment of a disease associated with activity of a ligand of 
an insulin receptor family member or EGF receptor family member which
includes an antagonist obtained by a method according to the second aspect
of the present invention and a pharmaceutically acceptable carrier or diluent.

In a ninth aspect the present invention provides a method of 
preventing or treating a disease associated with reduced activity of a ligand 
of an insulin receptor family member or EGF receptor family member which
method includes administering to a subject in need thereof an agonist
obtained by a method according to the first or third aspects of the present
invention.

Diseases associated with reduced activity of a ligand of an insulin 
receptor family member or EGF receptor family member include diabetes, 
osteoporosis, nerve degeneration and a range of catabolic states. 

In a tenth aspect the present invention provides a method of 
preventing or treating a disease associated with activity of a ligand of an 
insulin receptor family member or EGF receptor family member which 
method includes administering to a subject in need thereof an antagonist 
obtained by a method according to the second aspect of the present 
invention. 

Diseases associated with activity of a ligand of an insulin receptor 
family member or EGF receptor family member include cancer, leukaemia 
and many types of tumour states including but not restricted to breast cancer, 
brain tumours, ovarian cancer, pancreatic tumours, lung cancer, melanoma, 
rhabdomyosarcoma, mesothelioma and glioblastoma. 

Brief Description of the Drawings 

Figure 1. IGF-1R residues 1-462, in terms of atomic coordinates refined to a 
resolution of 2.6 Å (average accuracy approximately 0.3 Å). The coordinates are 
in relation to a Cartesian system of orthogonal axes. 

Figure 2. Depiction of the residues lining the groove of the IGF-1R receptor 
fragment 1-462. 

Figure 3. Gel filtration chromatography of affinity-purified IGF-1R/462 
protein. The protein was purified on a Superdex S200 column (Pharmacia) 
fitted to a BioLogic L.C. system (Biorad), equilibrated and eluted at 0.8 
ml/min with 40 mM Tris/150 mM NaCl/0.02% NaN3 adjusted to pH 8.0. (a) 
Protein eluting in peak 1 contained aggregated IGF-1R/462 protein, peak 2 
contained monomeric protein and peak 3 contained the c-myc undecapeptide 
used for elution from the Mab 9E10 immunoaffinity column. (b) Non-reduced 
SDS-PAGE of fraction 2 from IGF-1R/462 obtained following 
Superdex S200 (Fig. 1a). Standard proteins are indicated. 

Figure 4. Ion exchange chromatography of affinity-purified, truncated IGF-1R 
ectodomain. A mixture of gradient and isocratic elution chromatography 
was performed on a Resource Q column (Pharmacia) fitted to a BioLogic 
System (Biorad), using 20 mM Tris/pH 8.0 as buffer A and the same buffer 
containing 1 M NaCl as buffer B. Protein solution in TBSA was diluted at least 
1:2 with water and loaded onto the column at 2 ml/min. Elution was 
monitored by absorbance (280 nm) and conductivity (mS/cm). Target protein 
(peak 2) eluted isocratically with 20 mM Tris/0.14 M NaCl pH 8.0. Inset: 
Isoelectric focusing gel (pH 3-7; Novex Australia Pty Ltd) of fraction 2. The 
pI was estimated at 5.1 from standard proteins (not shown). 

Figure 5. Gel filtration chromatography of affinity-purified IR/485 protein. 
Affinity-purified material at 1 mg/ml produced a dominant peak at apparent 
mass ~140 kDa (interpreted as a dimer) (a); whereas affinity-purified 
material at 0.02 mg/ml produced a dominant peak at apparent mass ~85 kDa 
(interpreted as a monomer) (b). 




Figure 6. (a) SDS-PAGE of IR/485 following gel filtration chromatography. 
The protein migrated as a single broad band of apparent molecular mass ~78 
kDa (reduced - lane A) or ~68 kDa (non-reduced - lane B). (b) Isoelectric 
focussing of the IR/485 protein. The IR/485 fragment reacted positively in an 
ELISA with Mab 83-7, gave a single sequence corresponding to the N-terminal 10 
residues of IR, and showed several isoforms on isoelectric focussing 
from pI 6.0-6.8. The fragment was further purified by ion-exchange 
chromatography on Uno Q (BioRad, USA), using stepwise isocratic elution 
with incremental changes in salt concentrations (see Figure 7). Fractions A 
and D were each enriched in a component isoform from the ladder of 
isoforms present in the unfractionated mixture. Both these fractions 
produced crystals, whereas no crystals were obtained from fractions B and C. 

Figure 7. Purification of the IR/485 protein by ion-exchange chromatography 
on Uno Q (BioRad, USA), using stepwise isocratic elution with incremental 
changes in salt concentrations. 

Figure 8. Polypeptide fold for residues 1-462 of IGF-1R. The L1 domain is at 
the top, viewed from the N-terminal end, and L2 is at the bottom. The space 
at the centre is of sufficient size to accommodate IGF-1. Helices are 
indicated by curled ribbon and β-strands by arrows. Cysteine side chains are 
drawn as ball-and-stick with lines showing disulfide bonds. The arrow 
points in the direction of view for Figure 9. 

Figure 9. Amino acid sequences of IGF-1R and related proteins. a, L1 and L2 
domains of human IGF-1R and IR are shown based on a sequence alignment 
for the two proteins and a structural alignment for the L1 and L2 domains. 
Positions showing conservation of physico-chemical properties of amino acids 
are boxed, residues used in the structural alignment are shaded yellow and 
residues which form the Trp 176 pocket are in red. Secondary structure 
elements for L1 (above the sequences) and L2 (below) are indicated as 
cylinders for helices and arrows for β-strands. Strands are colour coded 
according to the β-sheet to which they belong. Disulfide bonds are also 
indicated. b, Cys-rich domains of human IGF-1R, IR and EGFR (domains 2 
and 4) are aligned based on sequence and structural considerations. 
Secondary structural elements and disulfide bonds are indicated above the 
sequences. The dashed bond is only present in IR. Different types of 
disulfide bonded modules are labelled below the sequences as open, filled or 
broken lines. Boxed residues show conservation of physico-chemical 
properties and structurally conserved residues for modules 4-7 are shaded 
yellow. Residues from EGFR which do not conform to the pattern are shaded 
grey and the conserved Trp 176 and the semi-conserved Gln 182 are shaded 
red. This figure was prepared using ALSCRIPT (Barton, G. J., 1993, Prot. 
Engineering, 6:37-40). 

Figure 10. Stereo view of a superposition of the L1 (white) and L2 (black) 
domains. Residue numbers above are for L1 and below for L2. The side 
chain of Trp 176, which protrudes into the core of L1, is drawn as 
ball-and-stick. 

Figure 11. Schematic diagram showing the association of three β-finger 
motifs. β-strands are drawn as arrows and disulfide bonds as zigzags. 

Figure 12. GRASP [Nicolls, A. et al., 1993, Biophys. J. 64, 166-170] surface 
diagram of the L1 domain of IGF-1R shown in a similar view to Figure 8. The 
N-terminal β-strand is at the top. The mutation L87A [Nakae, J., et al., 1995, 
J. Biol. Chem. 270, 22017-22022] and four regions (residues 12-15, 34-44, 64-67 
and 89-91 of IR) shown to be important in insulin binding to IR [Williams, 
P. F. et al., 1995, J. Biol. Chem. 270, 3012-3016] correspond to a patch of 
residues on the large β-sheet. Residue numbers for IR/IGF-1R are given and 
residues are coloured according to the magnitude of Kd(mutant)/Kd(wild 
type): red, > 40; orange, 10-40; yellow, 2.5-10; green, < 2.5; non-secreting, 
white; untested, blue. All mutants on the opposite face of the domain do not 
affect insulin affinity. 

Figure 13: Sequence Alignment of hIGF-1R, hIR and hIRR Ectodomains. 
Derived by use of the PileUp program in the software package of the Genetics 
Computer Group, 575 Science Drive, Madison, Wisconsin, USA. 
For assignment of homologous 3D structures see Figure 9. 

Figure 14: Sequence Alignment of EGFR, ErbB2, ErbB3 and ErbB4 
Ectodomains. Derived by use of the PileUp program in the software package 
of the Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, 
USA. For alignment on the IGF-1R fragment and assignment of homologous 
3D structures, see Figure 9. 

Figure 15: Sequence Alignment and Classification of the Disulphide-bonded 
Modules in the Cys-rich domains of IGF-1R, IR, IRR, EGFR, ErbB2, ErbB3 
and ErbB4. 

Figure 16. Gel filtration chromatography of insulin receptor ectodomain 
and MFab complexes. hIR -11 ectodomain dimer (5 - 20 mg) was complexed 
with MFab derivatives (15-25 mg each) of the anti-hIR antibodies 18-44, 83-7 
and 83-14 (Soos et al., 1986). Elution profiles were generated from samples 
loaded onto a Superdex S200 column (Pharmacia), connected to a BioLogic 
chromatography system (Biorad) and monitored at 280 nm. The column was 
eluted at 0.8 ml/min with 40 mM Tris/150 mM sodium chloride/0.02% 
sodium azide buffer adjusted to pH 8.0: Profile 0, hIR -11 ectodomain; Profile 
1, ectodomain mixed with MFab 18-44; Profile 2, ectodomain mixed with 
MFab 18-44 and MFab 83-14; Profile 3, ectodomain mixed with MFab 18-44, 
MFab 83-14 and MFab 83-7. The apparent mass of each complex was 
determined from a plot of the following standard proteins: thyroglobulin (660 
kDa), ferritin (440 kDa), bovine gammaglobulin (158 kDa), bovine serum 
albumin (67 kDa), chicken ovalbumin (44 kDa) and equine myoglobin (17 
kDa). 

Figure 17. Micrographs of hIR and hIGF-1R ectodomains. (a) Undecorated 
hIR ectodomain dimer stained with methylamine tungstate showing parallel 
bars. (b) Undecorated hIR ectodomain dimer stained with uranyl formate, 
showing well-spaced parallel bars corresponding to the cartoon below. 
(c) Undecorated hIGF-1R ectodomain dimer stained with uranyl formate. 
Magnification bars for (a), (b) and (c) 50 nm. 

Figure 18. Micrographs of hIR and hIGF-1R ectodomains. (a) Thinly stained 
region of undecorated hIR ectodomain dimers in uranyl formate, showing 
U-shaped particles (circled) as well as parallel bars as in the cartoon below. (b) 
Undecorated hIGF-1R ectodomain dimer under similar staining conditions. 
Magnification bars 50 nm. 

Figure 19. hIR ectodomain dimer complexed with MFab 83-7 and stained 
with KPT. Three projections can be recognised: circled particles have the Fab 
arms displaced either clockwise as in the cartoon below left, or anticlockwise 
as in the cartoon below middle; arrowed particles have the Fab arms in a 
central position, cartoon below right. Magnification bar 50 nm. 

Figure 20. hIR ectodomain dimer complexed with MFab 83-7 and stained 
with uranyl formate showing the parallel bar structure in particles having the 
Fab arms displaced (circled). Magnification bar 50 nm. 

Figure 21. (a) hIR ectodomain dimer complexed with MFab 83-14 stained 
with potassium phosphotungstate, showing Fab arms attached near the 
bottom of U-shaped particles (circled). The corresponding cartoon is shown 
below left. (b) hIR ectodomain dimer complexed with MFab 83-14 stained 
with uranyl acetate, showing both the view described above (circled) and the 
parallel-bar view with diagonally projecting Fab arms (arrowed), as in the 
cartoon below right. Magnification bars 50 nm. 

Figure 22. Double complex of hIR ectodomain dimer with MFabs 83-7 and 
18-44 showing particles of complex shape (circled) with four Fab arms 
attached, consistent with the cartoon below. Magnification bar 50 nm. 

Figure 23. Images of hIR ectodomain dimer co-complexed with MFabs 83-7, 
83-14 and 18-44 showing examples of complex particles (circled) where it is 
possible to identify that there are more than four MFabs bound to the dimeric 
central region. Magnification bar 50 nm. 

Figure 24. Schematic illustrating the proposed model of the hIR ectodomain 
dimer. The dimensions of the molecular envelope are as shown in the 
diagram, as is the position of the two-fold axis. 

Detailed Description of the Invention 

We describe herein the expression, purification, and crystallization of 
a recombinant IGF-1R fragment (residues 1-462) containing the 
L1-cysteine-rich-L2 region of the ectodomain. The selected truncation position is just 
downstream of the exon 6/exon 7 junction (Abbott, A. M., et al., 1992, J Biol 
Chem., 267:10759-10763) and occurs at a position where the sequences of the 
IR and EGFR families diverge markedly (Ward, C. W., et al., 1995, Proteins: 
Struct., Funct., Genet. 22:141-153; Lax, I., et al., 1988, Molec. Cellul. Biol. 
8:1970-1978), suggesting it represents a domain boundary. To limit the effects 
of glycosylation, the IGF-1R fragment was expressed in Lec8 cells, a 
glycosylation mutant of Chinese hamster ovary (CHO) cells, whose defined 
glycosylation defect produces N-linked oligosaccharides truncated at 
N-acetylglucosamine residues distal to mannose residues (Stanley, P. 1989, 
Molec. Cellul. Biol. 9:377-383). Such an approach has facilitated glycoprotein 
crystallization (Davis, S. J., et al., 1993, Protein Eng. 6:229-232; Liu, J., et al., 
1996, J. Biol. Chem. 271:33639-33646). 

The IGF-1R construct described herein included a c-myc peptide tag 
(Hoogenboom, H. R., et al., 1991, Nucleic Acids Res. 19:4133-4137) that is 
recognised by the Mab 9E10 (Evan, G. I., et al., 1985, Mol. Cell. Biol. 
5:3610-3616), enabling the expressed product to be purified by peptide elution from 
an antibody affinity column followed by gel filtration over Superdex S200. 
The purified proteins crystallized under a sparse matrix screen (Jancarik, J. & 
Kim, S.-H., 1991, J. Appl. Cryst. 24:409-411) but the crystals were of variable 
quality, with the best diffracting to 3.0-3.5 Å. Isocratic gradient elution by 
anion-exchange chromatography yielded protein that was less heterogeneous 
and gave crystals of sufficient quality to determine the structure of the first 
three domains of the human IGF-1R. 

The IGF-1R fragment consisted of residues 1-462 of IGF-1R linked via 
an enterokinase-cleavable pentapeptide sequence to an eleven-residue c-myc 
peptide tag at the C-terminal end. The fragment was expressed in Lec8 cells 
by continuous media perfusion in a bioreactor using porous carrier disks. It 
was secreted into the culture medium and purified by peptide elution from 
an anti-c-myc antibody column followed by Superdex S200 gel filtration. The 
receptor fragment bound two anti-IGF-1R monoclonal antibodies, 24-31 and 
24-60, which recognize conformational epitopes, but could not be shown to 
bind IGF-1 or IGF-2. Crystals of variable quality were grown as rhombic 
prisms in 1.7 M ammonium sulfate at pH 7.5 with the best diffracting to 
3.0-3.5 Å. Further purification by isocratic elution on an anion-exchange column 
gave protein which produced better quality crystals, diffracting to 2.6 Å, that 
were suitable for X-ray structure determination. 

The structure of this fragment (IGF-1R residues 1-462; L1-cys-rich-L2 
domains) has been determined to 2.6 Å resolution by X-ray diffraction. The 
L domains each adopt a compact shape consisting of a single-stranded 
right-handed β-helix. The cys-rich region is composed of eight disulphide-bonded 
modules, seven of which form a rod-shaped domain with modules associated 
in a novel manner. At the centre of this reasonably extended structure is a 
space, bounded by all three domains, and of sufficient size to accommodate a 
ligand molecule. Functional studies on IGF-1R and other members of the 
insulin receptor family show that the regions primarily responsible for 
hormone-binding map to this central site. Thus this structure gives a first 
view of how members of the insulin receptor family might interact with their 
ligands. 

Another group has reported the crystallization of a related receptor, 
the EGFR, in a complex with its ligand EGF (Weber, W., et al., 1994, J 
Chromat. 679:181-189). However, difficulties were encountered with these 
crystals, which diffracted to only 6 Å, insufficient for the determination of an 
atomic resolution structure of this complex (Weber, W., et al., 1994, J 
Chromat. 679:181-189) or the generation of accurate models of structurally 
related receptor domains such as IGF-1R and IR by homology modelling. 

The present inventors have applied the same process to the IR and 
generated a fragment (residues 1-485) that covers the first three domains of 
the IR. This fragment has been expressed in transformed Lec8 cells, purified, 
and crystallized by similar methodologies to yield crystals suitable for X-ray 
diffraction. 

The present inventors have therefore developed 3D structural 
information about cytokine receptors to enable a more accurate 
understanding of how the binding of ligand leads to signal transduction. 
Such information provides a rational basis for the development of antagonists 
or agonists for specific therapeutic applications, something that heretofore 
could not have been predicted de novo from available sequence data. 

The precise mechanisms underlying the binding of agonists and 
antagonists to the IGF-1 receptor site are not fully clarified. However, the 
binding of the agonists or antagonists to the receptor site, preferably with an 
affinity in the order of lO'^M or higher, is understood to arise from enhanced 
stereochemical complementarity, relative to naturally occurring IGF-1 
ligands. 

Such stereochemical complementarity, pursuant to the present 
invention, is characteristic of a molecule that matches intra-site surface 
residues lining the groove of the receptor site as enumerated by the 
coordinates set out in Figure 1. The residues lining the groove are depicted 
in Figure 2. Substances which are complementary to the shape of the receptor 
site characterised by amino acids positioned at atomic coordinates set out in 
Figure 1 may be able to bind to the receptor site and, when the binding is 
sufficiently strong, substantially prohibit binding of the naturally occurring 
ligands to the site. 

It will be appreciated that it is not necessary that the 
complementarity between agonists or antagonists and the receptor site 
extend over all residues lining the groove in order to inhibit binding of the 
natural ligand. Accordingly, agonists or antagonists which bind to a portion 
of the residues lining the groove are encompassed by the present invention. 

In general, the design of a molecule possessing stereochemical 
complementarity can be accomplished by means of techniques that optimize, 
either chemically or geometrically, the "fit" between a molecule and a target 
receptor. Known techniques of this sort are reviewed by Sheridan and 
Venkataraghavan, Acc. Chem. Res. 1987 20 322; Goodford, J. Med. Chem. 
1984 27 557; Beddell, Chem. Soc. Reviews 1985, 279; Hol, Angew. Chem. 
1986 25 767 and Verlinde, C.L.M.J. & Hol, W.G.J., Structure 1994, 2, 577, the 
respective contents of which are hereby incorporated by reference. See also 
Blundell et al., Nature 1987 326 347 (drug development based on information 
regarding receptor structure). 

Thus, there are two preferred approaches to designing a molecule, 
according to the present invention, that complements the shape of IGF-1R or 
a related receptor molecule. By the geometric approach, the number of 
internal degrees of freedom (and the corresponding local minima in the 
molecular conformation space) is reduced by considering only the geometric 
(hard-sphere) interactions of two rigid bodies, where one body (the active 
site) contains "pockets" or "grooves" that form binding sites for the second 
body (the complementing molecule, as ligand). The second preferred 
approach entails an assessment of the interaction of respective chemical 
groups ("probes") with the active site at sample positions within and around 
the site, resulting in an array of energy values from which three-dimensional 
contour surfaces at selected energy levels can be generated. 

The geometric approach is illustrated by Kuntz et al., J. Mol. Biol. 
1982 161 269, the contents of which are hereby incorporated by reference, 
whose algorithm for ligand design is implemented in a commercial software 
package distributed by the Regents of the University of California and further 
described in a document, provided by the distributor, which is entitled 
"Overview of the DOCK Package, Version 1.0,", the contents of which are 
hereby incorporated by reference. Pursuant to the Kuntz algorithm, the 
shape of the cavity represented by the IGF-1R site is defined as a series of 
overlapping spheres of different radii. One or more extant data bases of 
crystallographic data, such as the Cambridge Structural Database System 
maintained by Cambridge University (University Chemical Laboratory, 
Lensfield Road, Cambridge CB2 1EW, U.K.) and the Protein Data Bank 
maintained by Brookhaven National Laboratory (Chemistry Dept. Upton, NY 
11973, U.S.A.), is then searched for molecules which approximate the shape 
thus defined. 
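
The sphere description of the cavity can be illustrated with a minimal sketch. This is not the DOCK algorithm itself, which matches sphere centres to ligand atoms by comparing internal distances; it is a simplified containment score for candidates already placed in the site frame. The sphere coordinates, the two candidate "molecules" and the function names are hypothetical and are not taken from Figure 1.

```python
import math

# Hypothetical cavity description: overlapping spheres (x, y, z, radius), in Å.
SITE_SPHERES = [
    (0.0, 0.0, 0.0, 3.0),
    (3.5, 0.0, 0.0, 2.5),
    (6.5, 1.0, 0.0, 2.0),
]

def inside_site(point, spheres=SITE_SPHERES):
    """Return True if the point lies inside at least one site sphere."""
    return any(math.dist(point, (x, y, z)) <= r for x, y, z, r in spheres)

def shape_fit(atoms, spheres=SITE_SPHERES):
    """Fraction of candidate atom centres that fall inside the cavity."""
    if not atoms:
        return 0.0
    return sum(inside_site(a, spheres) for a in atoms) / len(atoms)

def rank_candidates(library, spheres=SITE_SPHERES):
    """Return (score, name) pairs sorted from best to worst shape fit."""
    scored = [(shape_fit(atoms, spheres), name) for name, atoms in library.items()]
    return sorted(scored, reverse=True)

# Hypothetical candidates, given as atom centres in the site coordinate frame.
LIBRARY = {
    "elongated": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0), (6.0, 1.0, 0.0)],
    "bulky": [(0.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, -4.0, 0.0), (0.0, 0.0, 4.0)],
}

for score, name in rank_candidates(LIBRARY):
    print(f"{name}: {score:.2f}")
```

A real search would also sample orientations of each candidate and penalise clashes with receptor atoms; the sketch only shows how a sphere-based cavity description lets a database be ranked by shape.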

Molecules identified in this way, on the basis of geometric 
parameters, can then be modified to satisfy criteria associated with chemical 
complementarity, such as hydrogen bonding, ionic interactions and van der 
Waals interactions. 

The chemical-probe approach to ligand design is described, for 
example, by Goodford, J. Med. Chem. 1985 28 849, the contents of which are 
hereby incorporated by reference, and is implemented in several commercial 
software packages, such as GRID (product of Molecular Discovery Ltd., West 
Way House, Elms Parade, Oxford OX2 9LL, U.K.). Pursuant to this approach, 
the chemical prerequisites for a site-complementing molecule are identified 
at the outset, by probing the active site (as represented via the atomic 
coordinates shown in Fig. 1) with different chemical probes, e.g., water, a 
methyl group, an amine nitrogen, a carboxyl oxygen, and a hydroxyl. 
Favored sites for interaction between the active site and each probe are thus 
determined, and from the resulting three-dimensional pattern of such sites a 
putative complementary molecule can be generated. 
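
The probe approach can be sketched in a few lines. The example below is an illustration only and does not reproduce the GRID energy function: it assumes a single generic probe interacting with every site atom through a 12-6 Lennard-Jones term, and the site atoms, well depth, optimum distance and energy threshold are all hypothetical values.

```python
import math

# Hypothetical site atoms (x, y, z in Å); real work would use the Figure 1 coordinates.
SITE_ATOMS = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (2.0, 3.5, 0.0)]

def lennard_jones(r, epsilon=0.2, r_min=3.5):
    """12-6 energy with well depth epsilon at separation r_min (assumed values)."""
    if r < 1e-6:
        return math.inf  # probe sits on top of an atom
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)

def probe_energy(point, atoms=SITE_ATOMS):
    """Total interaction energy of the probe at one sample position."""
    return sum(lennard_jones(math.dist(point, a)) for a in atoms)

def scan_grid(lo, hi, step, atoms=SITE_ATOMS):
    """Evaluate the probe energy on a cubic lattice; return {point: energy}."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    return {(x, y, z): probe_energy((x, y, z), atoms)
            for x in axis for y in axis for z in axis}

def favoured_sites(grid, threshold):
    """Lattice points at or below the chosen energy contour level, best first."""
    return sorted((e, p) for p, e in grid.items() if e <= threshold)

grid = scan_grid(-2.0, 6.0, 1.0)
best_energy, best_point = min((e, p) for p, e in grid.items())
favoured = favoured_sites(grid, threshold=-0.3)
print(best_point, round(best_energy, 3), len(favoured))
```

Repeating the scan with different probe parameters gives one energy array per probe type, from which contour surfaces at selected energy levels can be drawn.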

The chemical-probe approach is especially useful in defining variants 
of a molecule known to bind the target receptor. Accordingly, 
crystallographic analysis of IGF-1 bound to the receptor site may provide 
useful information regarding the interaction between the archetype ligand 
and the active site of interest. 

A further use of the structure of IGF-1R fragment described here is in 
facilitating structure determination of a related protein such as a larger 
fragment of this receptor, another member of the insulin receptor family or a 
member of the EGF receptor family. This new structure could be either alone 
or in complex with its ligand. For crystallographic analysis this is achieved 
using the method of molecular replacement (Brunger, Meth. Enzym. 1997 276 
558-580; Navaza and Saludjian, ibid. 581-594; Tong and Rossmann, ibid. 
594-611; Bentley, ibid. 611-619) in a program such as XPLOR. In this procedure 
diffraction data are collected from a crystalline protein of unknown structure. 
A transform of these data (Patterson function) is compared with a Patterson 
function calculated from a known structure. Firstly, one Patterson 
function is rotated on the other to determine the correct orientation of the 
unknown molecule in the crystal. The translation function is then calculated 
to determine the location of the molecule with respect to the crystal axes. 
Once the molecule has been correctly positioned in the unit cell, initial 
phases for the experimental data may be calculated. These phases are 
necessary for calculation of an electron density map from which structural 
differences may be observed and for refinement of the structure. Due to 
limitations in the method, the search molecule must be structurally related to 
that which is to be determined. However, it is sufficient for only part of the 
unknown structure (e.g. < 50%) to be similar to the search molecule. Thus 
the three-dimensional structure of IGF-1R residues 1-462 may be used to 
solve structures consisting of related receptors, enabling a program of drug 
design as outlined above. 
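
The rotation step of molecular replacement can be illustrated with a toy calculation. The sketch below is an assumption-laden simplification, not the XPLOR procedure: it works in two dimensions, treats atoms as points, represents each Patterson function by its set of interatomic vectors, and scores agreement with a Gaussian overlap. The model coordinates, the 40 degree rotation and all function names are hypothetical.

```python
import math

def patterson_vectors(atoms):
    """All interatomic difference vectors (peaks of a point-atom Patterson map)."""
    return [(xi - xj, yi - yj)
            for i, (xi, yi) in enumerate(atoms)
            for j, (xj, yj) in enumerate(atoms) if i != j]

def rotate(points, degrees):
    """Rotate 2D points anticlockwise about the origin."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def overlap(target, probe, width=0.3):
    """Sum of Gaussian overlaps between two vector sets; higher is better."""
    return sum(math.exp(-((tx - px) ** 2 + (ty - py) ** 2) / (2 * width ** 2))
               for tx, ty in target for px, py in probe)

def rotation_search(target_vectors, model_vectors, step=1):
    """Angle in [0, 180) that best superimposes the model Patterson on the target.

    A Patterson map is centrosymmetric, so in 2D the angles a and a + 180
    are equivalent and only half a turn needs to be searched.
    """
    return max(range(0, 180, step),
               key=lambda a: overlap(target_vectors, rotate(model_vectors, a)))

# Hypothetical known structure (search model).
MODEL = [(0.0, 0.0), (1.5, 0.0), (2.2, 1.3), (0.4, 2.1)]
# Simulated "unknown" structure: the model rotated by 40 degrees and shifted.
UNKNOWN = [(x + 5.0, y - 3.0) for x, y in rotate(MODEL, 40)]

best = rotation_search(patterson_vectors(UNKNOWN), patterson_vectors(MODEL))
print(best)
```

Because interatomic vectors do not depend on where the molecule sits, the orientation is recovered even though the simulated unknown was also translated; locating the molecule in the cell is the job of the separate translation function described above.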

In summary, the general principles of receptor-based drug design can 
be applied by persons skilled in the art, using the crystallographic results 
presented above, to produce agonists or antagonists of IGF-1R or other related 
receptors, having sufficient stereochemical complementarity to exhibit high 
affinity binding to the receptor site. 

The present invention is further described below with reference to 
the following, non-limiting examples. 

EXAMPLE 1 

Expression, Purification and Crystallization of the IGF-1R Fragment 

Several factors hamper macromolecular crystallization including 
sample selection, purity, stability, solubility (McPherson, A., et al., 1995, 
Structure 3:759-768; Gilliland, G. L. & Ladner, J. E., 1996, Curr. Opin. 
Struct. Biol. 6:595-603), and the nature and extent of glycosylation (Davis, S. 
J., et al., 1993, Protein Eng. 6:229-232). Initial attempts to obtain structural 
data from soluble IGF-1R ectodomain (residues 1-906) protein, expressed in 
Lec8 cells (Stanley, P. 1989, Molec. Cellul. Biol. 9:377-383) and purified by 
affinity chromatography, produced large, well-formed crystals (1.0 mm x 0.2 
mm x 0.2 mm) which gave no discernible X-ray diffraction pattern 
(unpublished data). Similar difficulties have been encountered with crystals 
of the structurally related epidermal growth factor receptor (EGFR) 
ectodomain which diffracted to only 6 Å, insufficient for the determination 
of an atomic resolution structure (Weber, W. et al., 1994, J Chromat 
679:181-189). This prompted us to search for a fragment of IGF-1R that was more 
amenable to X-ray crystallographic studies. 

The fragment expressed (residues 1-462) comprises the L1-cysteine-rich-L2 
region of the ectodomain. The selected truncation position at Val462 
is four residues downstream of the exon 6/exon 7 junction (Abbott, A. M., et 
al., 1992, J Biol Chem. 267:10759-10763) and occurs at a position where the 
sequences of the IR and the structurally related EGFR families diverge 
markedly (Lax, I., et al., 1988, Molec Cell Biol. 8:1970-1978; Ward, C. W., et 
al., 1995, Proteins: Struct., Funct., Genet. 22:141-153), suggesting it 
represents a domain boundary. The expression strategy included use of the 
pEE14 vector (Bebbington, C. R. & Hentschel, C. C. G., 1987, In: Glover, D. 
M., ed. DNA Cloning, Academic Press, San Diego. Vol 3, p163) in 
glycosylation-defective Lec8 cells (Stanley, P., 1989, Molec. Cellul. Biol. 
9:377-383), which produce N-linked oligosaccharides lacking the terminal galactose 
and N-acetylneuraminic acid residues (Davis, S. J., et al., 1993, Protein Eng. 
6:229-232; Liu, J., et al., 1996, J Biol Chem 271:33639-33646). The construct 
contained a C-terminal c-myc affinity tag (Hoogenboom, H. R., et al., 1991, 
Nucl Acids Res. 19:4133-4137), which facilitated immunoaffinity purification 
by specific peptide elution and avoided aggressive purification conditions. 
These procedures yielded protein which readily crystallized after a gel 
filtration polish. This provided a general protocol to enhance crystallisation 
prospects for labile, multidomain glycoproteins. 

The structure of this fragment is of considerable interest since it 
contains the major determinants governing insulin and IGF-1 binding 
specificity (Gustafson, T. A. & Rutter, W. J., 1990, J. Biol. Chem. 
265:18663-18667; Andersen, A. S., et al., 1990, Biochemistry, 29:7363-7366; 
Schumacher, R., et al., 1991, J. Biol. Chem. 266:19288-19295; Schumacher, 
R., et al., 1993, J. Biol. Chem. 268:1087-1094; Schaffer, L., et al., 1993, J. Biol. 
Chem. 268:3044-3047; Williams, P. F., et al., 1995, J. Biol. Chem. 
270:3012-3016) and is very similar to an IGF-1R fragment (residues 1-486) reported to 
act as a strong dominant negative for several growth functions and which 
induces apoptosis of tumour cells in vivo (D'Ambrosio, C., et al., 1996, 
Cancer Res. 56:4013-4020). 

The expression plasmid pEE14/IGF-1R/462 was constructed by inserting the 
oligonucleotide cassette: 

   AatII
5' GACGTC GACGATGACGATAAG GAACAAAAACTCATC
   D  V   D  D  D  D  K   E  Q  K  L  I
          (EK cleavage)   (c-myc tail)

   S  E  E  D  L  N   (Stop)
   TCAGAAGAGGATCTGAAT TAGAATTC GACGTC 3'
                        EcoRI  AatII

encoding an enterokinase cleavage site, c-myc epitope tag (Hoogenboom, H. 
R., et al., 1991, Nucleic Acids Res. 19:4133-4137) and stop codon into the 
AatII site (within codon 462) of IGF-1 receptor cDNA in the mammalian 
expression vector pECE (Ebina, Y., et al., 1985, Cell, 40:747-758; kindly 
supplied by W. J. Rutter, UCSF, USA), and introducing the DNA comprising 
the 5' 1521 bp of the cDNA (Ullrich, A., et al., 1986, EMBO J. 5:2503-2512) 
ligated to the oligonucleotide cassette into the EcoRI site of the mammalian 
plasmid expression vector pEE14 (Bebbington, C. R. & Hentschel, C. C. G., 
1987, In: Glover, D. M., ed. DNA Cloning. Academic Press, San Diego. Vol 3, 
p163; Celltech Ltd., UK). Plasmid pEE14/IGF-1R/462 was transfected into 
Lec8 mutant CHO cells (Stanley, P. 1989, Molec. Cellul. Biol. 9:377-383) 
obtained from the American Type Culture Collection (CRL:1737) using 
Lipofectin (Gibco-BRL). Cell lines were maintained after transfection in 
glutamine-free medium (Glasgow modification of Eagle's medium, GMEM; 
ICN Biomedicals, Australia) and 10% dialysed FCS (Sigma, Australia) 
containing 25 μM methionine sulphoximine (MSX; Sigma, Australia) as 
described (Bebbington, C. R. & Hentschel, C. C. G., 1987, In: Glover, D. M., 
ed. DNA Cloning. Academic Press, San Diego. Vol 3, p163). Transfectants 
were screened for protein expression by Western blotting and sandwich 
enzyme-linked immunosorbent assay (ELISA) (Cosgrove, L., et al., 1995) 
using monoclonal antibody (Mab) 9E10 (Evan et al., 1985) as the capture 
antibody and either biotinylated anti-IGF-1R Mab 24-60 or 24-31 for 
detection (Soos et al., 1992; gifts from Ken Siddle, University of Cambridge, 
UK). Large-scale cultivation of selected clones expressing IGF-1R/462 was 
carried out in a Celligen Plus bioreactor (New Brunswick Scientific, USA) 
containing 70 g Fibra-Cel Disks (Sterilin, UK) as carriers in a 1.25 L working 
volume. Continuous perfusion culture using GMEM medium supplemented 
with non-essential amino acids, nucleosides, 25 μM MSX and 10% FCS was 
maintained for 1 to 2 weeks followed by the more enriched DMEM/F12 
without glutamine, with the same supplementation for the next 4-5 weeks. The 
fermentation production run was carried out three times under similar 
conditions and resulted in an estimated overall yield of 50 mg of receptor 
protein from 430 L of harvested medium. Cell growth was poor during the 
initial stages of the fermentation when GMEM medium was employed, but 
improved dramatically following the switch to the more enriched medium. 
Target protein productivity was essentially constant during the period from 
~100 to 700 h of the 760 h fermentation, as measured by ELISA using Mab 
9E10 as the capture antibody and biotinylated Mab 24-31 as the developing 
antibody. 
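
The reading frame of the oligonucleotide cassette described above can be checked with a short script. This is a minimal sketch: the cassette sequence is copied from the text, the codon table is the standard genetic code, and the function and variable names are illustrative only.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Cassette as given in the text, 5' to 3', with the spacing removed.
CASSETTE = (
    "GACGTC"              # AatII site, codons for D V
    "GACGATGACGATAAG"     # enterokinase cleavage site
    "GAACAAAAACTCATC"     # c-myc tag, first part
    "TCAGAAGAGGATCTGAAT"  # c-myc tag, second part
    "TAGAATTC"            # stop codon overlapping the EcoRI site
    "GACGTC"              # AatII site
)

def translate(dna):
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

peptide = translate(CASSETTE)
print(peptide)
```

The translated peptide contains the DDDDK enterokinase recognition sequence followed by the eleven-residue c-myc epitope EQKLISEEDLN, which is the peptide later used for elution from the Mab 9E10 column.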

Soluble IGF-1R/462 protein was recovered from harvested 
fermentation medium by affinity chromatography on columns prepared by 
coupling Mab 9E10 to divinyl sulphone-activated agarose beads (Mini-Leak; 
Kem En Tec, Denmark) as recommended by the manufacturer. Mini-Leak 
Low and Medium affinity columns with antibody loadings of 1.5-4.5 mg/ml of 
hydrated matrix were obtained, with the loading range of 2.5-3 mg/ml giving 
optimal performance (data not shown). Mab 9E10 was produced by growing 
hybridoma cells (American Type Culture Collection) in serum-free medium 
in the Celligen Plus bioreactor and recovering the secreted antibody (4 g) 
using protein A glass beads (Prosep-A, Bioprocessing Limited, USA). 
Harvested culture medium containing IGF-1R/462 protein was adjusted to pH 
8.0 with Tris-HCl (Sigma), made 0.02% (w/v) in sodium azide and passed at 
3-5 ml/min over 50 ml Mab 9E10 antibody columns at 4 °C. Bound protein 
was recovered by recycling a solution of 2-10 mg of the undecamer c-myc 
peptide EQKLISEEDLN (Hoogenboom et al., 1991) in 20 ml of Tris-buffered 
saline containing 0.02% sodium azide (TBSA). Between 65% and 75% of the 
product was recovered from the medium as estimated by ELISA, with a 
further 15-25% being recovered by a second pass over the columns. Peptide 
recirculation (~10 times) through the column eluted bound protein more 
efficiently than a single, slower elution. Residual bound protein was eluted 
with sodium citrate buffer at pH 3.0 into 1 M Tris HCl pH 8.0 to neutralize 
the eluant, and columns were re-equilibrated with TBSA. 

Gel filtration over Superdex S200 (Pharmacia, Sweden) of affinity-
purified material showed a dominant protein peak at ~63 kDa, together with
a smaller quantity of aggregated protein (Figure 3a). The peak protein
migrated primarily as two closely spaced bands on reduced sodium dodecyl
sulfate polyacrylamide gel electrophoresis (SDS-PAGE; Figure 3b), reacted
positively in the ELISA with both Mab 24-60 and Mab 24-31, and gave a
single sequence corresponding to the N-terminal 14 residues of IGF-1R. No
binding of IGF-1 or IGF-2 could be detected in the solid plate binding assay
(Cosgrove et al., 1995, Protein Express Purif. 6:789-798). The IGF-1R/462
fragment was further purified by ion-exchange chromatography on Resource
Q (Pharmacia, Sweden). Using shallow salt gradients, protein enriched in the
slowest migrating SDS-PAGE band was obtained (data not shown), which
formed relatively large, well-formed crystals (see below). Isoelectric focusing
showed the presence of one major and two minor isoforms. Protein purified
on Resource Q with an isocratic elution step of 0.14 M NaCl in 20 mM Tris-Cl
at pH 8.0 (fraction 2, Figure 4) showed less heterogeneity on isoelectric
focusing (Figure 4 inset) and SDS-PAGE (data not shown) and produced
crystals of sufficient quality for structure determination (see below).

Crystals were grown by the hanging drop vapour diffusion method
using purified protein concentrated in Centricon 10 concentrators (Amicon
Inc, USA) to 5-10 mg/ml in 10-20 mM Tris-HCl pH 8.0 and 0.02% (w/v) azide,
or 100 mM ammonium sulfate and 0.02% (w/v) azide. A search for
crystallization conditions was performed initially using the factorial screen
(Jancarik, J. & Kim, S.-H., 1991, J. Appl. Cryst. 24:409-411) and subsequently
optimised. Crystals were examined on an M18XHF rotating anode generator
(Siemens, Germany) equipped with Franks mirrors (MSC, USA) and RAXIS
IIC and IV image plate detectors (Rigaku, Japan).

From the initial crystallization screen of this protein, crystals of
about 0.1 mm in size grew in one week. Upon refining conditions, crystals of
up to 0.6 x 0.4 x 0.4 mm could be grown from a solution of 1.7-2.0 M
ammonium sulfate, 0.1 M HEPES pH 7.5. The crystals varied considerably in
shape and diffraction quality, growing predominantly as rhombic prisms with
a length to width ratio of up to 5:1, but sometimes as rhombic bipyramids,
the latter form being favoured when using material which had been eluted
from the Mab 9E10 column at pH 3.0. Each crystal showed a minor
imperfection in the form of very faint lines from the centre to the vertices.
Protein from dissolved crystals did not appear to be different from the protein
stock solution when run on an isoelectric focusing gel. Upon X-ray
examination, the crystals diffracted to 3.0-4.0 Å and were found to belong to
the space group P2₁2₁2₁ with a = 76.8 Å, b = 99.0 Å, c = 119.6 Å. In the
diffraction pattern, the crystal variability noted above was manifest as a large
(1-2°) and anisotropic mosaic spread, with concomitant variation in
resolution. To improve the quality of the crystals, they were grown in the
presence of various additives or were recrystallized. These methods failed to
substantially improve the crystal quality, although bigger crystals were
obtained by recrystallization. The variability in crystal quality appeared to be
due to protein heterogeneity, as demonstrated by the observation that more
highly purified protein, eluted isocratically from the Resource Q column and
showing one major band on isoelectric focusing (Figure 4 inset), produced
crystals of sufficient quality for structure determination. These crystals
diffracted to 2.6 Å resolution with cell dimensions a = 77.0 Å, b = 99.5 Å,
c = 120.1 Å and a mosaic spread of 0.5°. Heavy metal derivatives of the
IGF-1R/462 crystals have been obtained and are leading to the determination of
an atomic resolution structure of this fragment, which contains the L1,
cysteine-rich and L2 domains of human IGF-1R.
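
The cell dimensions quoted above allow a rough check of crystal packing. The following is a minimal sketch, not part of the original procedure: it assumes an orthorhombic cell with four asymmetric units (as in P2₁2₁2₁), one molecule per asymmetric unit, and uses the apparent gel filtration mass of ~63 kDa as a stand-in for the true molecular mass. The function names are illustrative only.

```python
def unit_cell_volume(a, b, c):
    """Volume in cubic angstroms of an orthorhombic cell (all angles 90 degrees)."""
    return a * b * c


def matthews_coefficient(volume, mass_da, z=4, molecules_per_asu=1):
    """Matthews coefficient V_M in cubic angstroms per dalton.

    z is the number of asymmetric units per cell (4 for P2(1)2(1)2(1)).
    molecules_per_asu = 1 is an assumption, not a value from the text.
    """
    return volume / (z * molecules_per_asu * mass_da)


def solvent_fraction(vm):
    """Matthews' approximation for the solvent fraction: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / vm


# Cell of the better-diffracting crystals: a = 77.0, b = 99.5, c = 120.1 angstroms.
volume = unit_cell_volume(77.0, 99.5, 120.1)
# 63 kDa is the apparent mass from gel filtration, used here as an assumption.
vm = matthews_coefficient(volume, 63_000)
print(f"cell volume = {volume:.0f} A^3, V_M = {vm:.2f} A^3/Da, "
      f"solvent = {solvent_fraction(vm):.0%}")
```

Under these assumptions the crystals would have a relatively high solvent content; a different mass or a different number of molecules per asymmetric unit would change the estimate.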
EXAMPLE 2 

Expression, Purification and Crystallization of the IR Fragment

A similar strategy was adopted for the human insulin receptor. The
fragment expressed (residues 1-485) comprises the L1-cysteine-rich-L2 region
of the IR ectodomain but extends 13 residues further before the attachment of
the 17 residue EK cleavage site linker and c-myc tail. The selected truncation
position corresponds to a unique and convenient Bgl II restriction site. The
expression strategy was also based on the pEE14 expression vector in
glycosidase-defective Lec8 cells and use of a C-terminal c-myc affinity tag
for immunoaffinity purification by specific peptide elution. These procedures
yielded IR protein which readily crystallized after a gel filtration polish.

The expression plasmid pHIR485 was constructed by ligating the 
double-stranded oligonucleotide cassette: 



   Bgl II                                                         Xba I
5' AGATC TCCGACGATGACGATAAG GAACAAAAACTCATCTCAGAAGAGGATCTGAAT TAG TCTAGA
   KI    SDDDDK             EQKLISEEDLN
         EK cleavage        c-myc tail                        Stop



encoding an enterokinase cleavage site, c-myc epitope tag (Hoogenboom, H.
R., et al., 1991, Nucleic Acids Res. 19:4133-4137) and stop codon, to the
larger 11.1 kilobasepair Bgl II / Xba I fragment isolated from digestion of the
mammalian expression plasmid pEH3 (a derivative of the mammalian
plasmid expression vector pEE14 [Bebbington, C. R. & Hentschel, C. C. G.,
1987, In: Glover, D. M., ed. DNA Cloning. Academic Press, San Diego. Vol 3,
p163; Celltech Ltd., UK] which holds the entire coding sequence of human
insulin receptor within a Hind III / Xba I fragment). Lec8 mutant CHO cells
(Stanley, P., 1989, Molec. Cellul. Biol. 9:377-383) obtained from the American
Type Culture Collection (CRL:1737) were transfected with pHIR485 using
Lipofectamine (Gibco-BRL). Cell lines were maintained after transfection in
glutamine-free medium (Glasgow modification of Eagle's medium - GMEM;
ICN Biomedicals, Australia) and 10% dialysed FCS (Sigma, Australia)
containing 25 µM methionine sulphoximine (MSX; Sigma, Australia) as
described (Bebbington, C. R. & Hentschel, C. C. G., 1987, In: Glover, D. M.,
ed. DNA Cloning. Academic Press, San Diego. Vol 3, p163). Transfectants
were screened for protein expression by Western blotting and sandwich
enzyme-linked immunosorbent assay (ELISA) (Cosgrove, L., et al., 1995)
using anti-hIR Mab 83-7 as the primary antibody and biotinylated
monoclonal antibody (Mab) 9E10 (Evan et al., 1985) for detection (Soos et al.,
1986; gifts from Ken Siddle, University of Cambridge, UK).
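
The oligonucleotide cassette shown above can be checked by translating it with the standard genetic code. The sketch below is illustrative only: the reading-frame offset of 2 is an assumption inferred from the printed translation, in which the lysine (K) codon is completed by upstream receptor sequence that is not part of the cassette, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Cassette as printed (spaces removed), split into its labelled segments.
CASSETTE = (
    "AGATC"                                # Bgl II end
    "TCCGACGATGACGATAAG"                   # enterokinase (EK) cleavage site
    "GAACAAAAACTCATCTCAGAAGAGGATCTGAAT"    # c-myc tail
    "TAG"                                  # stop codon
    "TCTAGA"                               # Xba I site
)


def translate(dna, offset=0):
    """Translate dna from offset, stopping at the first stop codon."""
    peptide = []
    for i in range(offset, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Assumed frame: the first two bases complete a codon begun upstream.
peptide = translate(CASSETTE, offset=2)
print(peptide)
```

In this frame the translation ends with the EK site SDDDDK followed by the c-myc epitope EQKLISEEDLN, which together make up the 17 residue linker and tail described in the text.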
Large-scale cultivation of selected clones expressing IR/485 was carried out
in a Celligen Plus bioreactor (New Brunswick Scientific, USA) containing 70
g Fibra-Cel Disks (Sterilin, UK) as carriers in a 1.25 L working volume.
Continuous perfusion culture was carried out using DMEM/F12 without
glutamine medium (ICN), supplemented with non-essential amino acids,
nucleosides, 25 µM MSX and 5-10% FCS, and resulted in an estimated
overall yield of 115 mg of receptor protein from 165 L of harvested medium.
Target protein productivity was essentially constant during the fermentation,
as measured by ELISA.
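
For comparison, the overall yields reported for the two fragments can be expressed per litre of harvested medium. This is a simple arithmetic sketch using only the figures quoted in Examples 1 and 2; the function name is illustrative.

```python
def volumetric_yield(mass_mg, volume_l):
    """Average product yield in mg per litre of harvested medium."""
    return mass_mg / volume_l


# Figures quoted in the text: 50 mg from 430 L (IGF-1R/462), 115 mg from 165 L (IR/485).
igf1r_yield = volumetric_yield(50, 430)
ir_yield = volumetric_yield(115, 165)

print(f"IGF-1R/462: {igf1r_yield:.2f} mg/L")
print(f"IR/485:     {ir_yield:.2f} mg/L")
print(f"ratio:      {ir_yield / igf1r_yield:.1f}")
```

These are averages over whole production runs and say nothing about how productivity varied with time or medium.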

Soluble IR/485 protein was recovered from harvested fermentation 
medium by affinity chromatography on columns of Mab 9E10 essentially as 
described in Example 1. Between 92% and 98% of the product was recovered from
the medium by this affinity-chromatography step, as estimated by ELISA. 

Gel filtration over Superdex 200 (Pharmacia, Sweden) of the affinity-
purified material at 1 mg/ml produced a dominant protein peak at apparent
mass ~140 kDa (Figure 5a; interpreted as dimer), whereas a peak at apparent
mass ~85 kDa was obtained at 0.02 mg/ml (Figure 5b; interpreted as monomer).
The protein migrated as a single broad band of apparent molecular
mass ~78 kDa (reduced; lane A) or ~68 kDa (non-reduced; lane B) on
sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE;
Figure 6a). The IR/485 fragment reacted positively in the ELISA with Mab 83-7,
gave a single sequence corresponding to the N-terminal 10 residues of IR,
and showed several isoforms on isoelectric focusing from pI 6.0-6.8 (Figure
6b). Crystallisation screening trials of the fragment produced crystals too
small for X-ray diffraction studies. The fragment was further purified by ion-
exchange chromatography on Uno Q (BioRad, USA), using stepwise isocratic
elution with incremental changes in salt concentration (Figure 7). Fractions
A and D were each enriched in a component isoform from the ladder of
isoforms present in the unfractionated mixture (Figure 6b). Both these
fractions produced crystals, whereas no crystals were obtained from fractions
B and C.

Crystals were grown by the hanging drop vapour diffusion method
using purified protein concentrated in Centricon 10 concentrators (Amicon
Inc, USA) to 5-10 mg/ml in 10 mM Tris-HCl pH 8.0 and 0.02% (w/v) azide. A
search for crystallization conditions was performed initially using the
factorial screen (Jancarik, J. & Kim, S.-H., 1991, J. Appl. Cryst. 24:409-411) and
subsequently optimised. Crystals were examined on an M18XHF rotating
anode generator (Siemens, Germany) equipped with Franks mirrors (MSC,
USA) and an RAXIS IIC image plate detector (Rigaku, Japan).

From the initial crystallization screen of protein fraction D, fine
needles grew in about one week. In further experiments, crystals of up to
0.04 x 0.04 x 0.2 mm could be grown from a solution of 1.9-2.0 M ammonium
sulfate, 2% PEG 400, 0.1 M HEPES pH 7.5. Upon X-ray examination, the
crystals diffracted to 4 Å and were found to belong to the space group P2₁2₁2₁
with a = 103.2 Å, b = 130.0 Å, c = 161.6 Å. Despite their small size these
crystals diffracted sufficiently well to allow collection of a low resolution
data set. Further purification of the protein and refinement of crystallisation
conditions should yield larger crystals, providing data to determine the
structure of this fragment at medium resolution or better.
EXAMPLE 3 

Structure of the IGF-1R/1-462

Crystals were cryo-cooled to -170°C in a mother liquor containing 20%
glycerol, 2.2 M ammonium sulfate and 100 mM Tris at pH 8.0. Native and
derivative diffraction data were recorded on Rigaku RAXIS IIC or IV area
detectors using copper Kα radiation from a Siemens rotating anode generator
with Yale/MSC mirror optics. The space group was P2₁2₁2₁ with a = 77.39 Å,
b = 99.72 Å and c = 120.29 Å. Data were reduced using DENZO and
SCALEPACK (Otwinowski, Z. & Minor, W., 1996, Meth. Enzym.
276:307-326). Diffraction was notably anisotropic for all crystals examined.

Phasing by multiple isomorphous replacement (MIR) was performed
with PROTEIN (Steigemann, W., Dissertation, Technical Univ. Munich, 1974)
using anomalous scattering for both UO2 and PIP derivatives. Statistics for
data collection and phasing are given in Table 1. In the initial MIR map
regions of protein and solvent could clearly be seen but the path of the
polypeptide was by no means obvious. That map was subjected to solvent
flattening and histogram matching in DM (Cowtan, K., 1994, Joint CCP4 and
ESF-EACBM Newslett. Protein Crystallogr. 31:34-38). The structure was
traced and rebuilt using O (Jones, T. A., et al., 1991, Acta Crystallogr.
A47:110-119) and refined with X-PLOR 3.851 (Brunger, A. T., 1996, X-PLOR
Reference Manual 3.851, Yale Univ., New Haven, CT). After 5 rounds of
rebuilding and energy minimisation the R-factor dropped to 0.279 and Rfree
to 0.359 for data in the 7-2.6 Å resolution range. The current model contains
458 amino acids and 3 N-linked carbohydrates but no solvent molecules. For
residues with B(Cα) > 70 Å² atomic positions are less reliable (37-42, 155-159,
305, 336-341, 404-406, 453-458). There is weak electron density for residues 459-
461 but the c-myc tail appears completely disordered.
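
For readers unfamiliar with the refinement statistics quoted above, the conventional crystallographic R-factor is the summed absolute difference between observed and calculated structure factor amplitudes divided by the summed observed amplitudes; Rfree is the same quantity computed over reflections excluded from refinement. The sketch below uses made-up amplitudes purely for illustration and assumes the calculated amplitudes are already on the same scale as the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional R-factor: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("amplitude lists must be non-empty and of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, not data from this structure.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 70.0, 35.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```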

The 1-462 fragment consists of the N-terminal three domains of IGF-1R
(L1, cys-rich, L2) and contains regions of the molecule which dictate
ligand specificity (17-23). The molecule adopts a reasonably extended
structure (approximately 40 x 48 x 105 Å) with domain 2 (cys-rich region)
making contact along the length of domain 1 (L1) but very little contact with
the third domain (L2) (see Figure 8). This leaves a space at the centre of the
molecule of approximately 24 Å x 24 Å x 24 Å which is bounded on three
sides by the three domains of the molecule. The space is of sufficient size to
accommodate the ligand, IGF-1.
The L domains 

Each of the L domains (residues 1-150 and 300-460) adopts a compact
shape (24 x 32 x 37 Å) consisting of a single-stranded right-handed β-helix
capped on the ends by short α-helices and disulfide bonds. The body of
the domain looks like a loaf of bread, with the base formed from a flat six-
stranded β-sheet, 5 residues long, and the sides being β-sheets three residues
long (Figures 8 & 9). The top is irregular but in places is similar for the two
domains. The two domains are superposable with an rms deviation in Cα
positions of 1.6 Å for 109 atoms (Figure 10). Although this fold is reminiscent
of other β-helix proteins it is much simpler and smaller with very few
elaborations and thus it represents a new superfamily of domains. One
notable difference between the two domains is that the indole ring of Trp 176
from the cys-rich region (Figure 9b) is inserted into the hydrophobic core of
L1 and the C-terminal helix is only vestigial (Figure 8). For the insulin
receptor family the sequence motif of residues which form the Trp pocket in
L1 does not occur in L2 (Figure 9a). However, in the EGF receptor, which has
an additional cys-rich region after the L2 domain (14, 15), the pocket motif
can be found in both L domains and the Trp is conserved in both cys-rich
regions (Figure 9b).
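
The rms deviation quoted above for the superposed L domains is the root of the mean squared distance between matched Cα atoms. The following minimal sketch computes that quantity for two coordinate lists that are assumed to be already optimally superposed and paired atom for atom; it does not perform the superposition itself, and the coordinates are toy values rather than data from the structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3D coordinates (same units in and out)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


# Toy coordinates: every pair is exactly 1 unit apart along z.
domain_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
domain_2 = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0), (0.0, 1.0, 1.0)]
print(rmsd(domain_1, domain_2))
```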

The repetitive nature of the β-helix is reflected in the sequence and
the first five turns were correctly identified by Bajaj, M., et al. (1987,
Biochim. Biophys. Acta 916:220-226), the conserved Gly residues being found
in turns making one bottom edge of the domain. However, their conclusions
about the fold were incorrect. The "helix-like" repeat is actually a pair of
bends at the top edge of the domain. In their Motif V, the Gly is not in a
bend but is followed by the insertion of a conserved loop of 7-8 residues (see
Figure 9a). Glycine is structurally important in the Gly bends as mutation of
these residues compromises folding of the receptor [van der Vorm, E.R., et
al., 1992, J. Biol. Chem. 267, 66-71; Wertheimer, E. et al., 1994, J. Biol. Chem.
269, 7587-7592].

Upon comparing the L domains with other right-handed β-helix
structures such as pectate lyase (Yoder, M. D., et al., 1993, Structure 1:241-
251-1507) and the P22 tailspike protein (Steinbacher, S., et al., 1997, J. Mol.
Biol. 267:865-880) there are some striking similarities as well as differences.
In all cases the ends of the domain are capped by α-helices but the L domains
also have a disulphide bond at each end to hold the termini. The other β-
helix domains are considerably longer and have significant twist to their
sheets while the L domains have flat sheets. Although the sizes of the helix
repeats are similar (here 24-25 residues vs 22-23 for pectate lyase) the cross-
sections are quite different. The L domains have a rectangular cross-section
while pectate lyase and P22 tailspike protein are V-shaped and have many,
and sometimes quite large, insertions (Yoder, M. D., et al., 1993, Structure
1:241-251-1507; Steinbacher, S., et al., 1997, J. Mol. Biol. 267:865-880). In
the hydrophobic core a common feature is the stacking of aliphatic residues
from successive turns of the β-helix, and near the C-terminus of each L
domain there is also a short Asn ladder, reminiscent of the long Asn ladder
observed in pectate lyase (Yoder, M. D., et al., 1993, Structure 1:241-251-
1507). On the opposite side of the L domains the Gly bend as well as the two
bends and sheet preceding it have no counterpart in the other β-helix
domains. Thus although the L domains are built on similar principles to the
other β-helix domains they constitute a separate superfamily.
The cys-rich domain 

The cys-rich domain is composed of eight disulfide-bonded modules (Figure
9b), the first of which sits at the end of L1 while the remainder make a
curved rod running diagonally across L1 and reaching to L2 (Figure 8). The
strands in modules 2-7 run roughly perpendicular to the axis of the rod in a
manner more akin to laminin (Stetefeld, J., et al., 1996, J. Mol. Biol. 257:644-
657) than to TNF receptor (Banner, D. W., et al., 1993, Cell 73:431-445) but
the modular arrangement of the cys-rich domain is different to other cys-rich
proteins for which structures are known. The first 3 modules of IGF-1R have
a common core, containing a pair of disulfide bonds, but show considerable
variation in the loops (Figure 9b). The connectivity of these modules is the
same as the first half of EGF (Cys 1-3 and 2-4) but their structures do not
appear to be closely related to any member of the EGF family. Modules 4 to
7 have a different motif, the β-finger, and best match residues 2152-2168 of
fibrillin (Dowling, A. K., et al., 1996, Cell 85:597-605). Each is composed of
three polypeptide strands, the first and third being disulfide bonded and the
latter two forming a β-ribbon. The β-ribbons of the β-finger modules line up
antiparallel to form a tightly twisted 8-stranded β-sheet (Figures 8 and 11).
Module 6 deviates from the common pattern with the first segment being
replaced by an α-helix followed by a large loop that is likely to have a role in
ligand binding (see below). As module 5 is most similar to module 7 it is
possible that the four modules arose from serial gene duplications. The final
module is a disulfide-linked bend of five residues.

The fact that the two major types of cys-rich modules occur 
separately implies that these are the minimal building blocks of cys-rich 
domains found in many proteins. Although it can be as short as 16 residues, 
the motif of modules 4-7 is clearly distinct and capable of forming a regular 
extended structure. Thus cys-rich domains such as these can be considered 
as made of repeat units each composed of a small number of modules.
Hormone binding

Attempts have been made to locate the IGF-1 (and insulin) binding
site by examining natural (Taylor, S. I., 1992, Diabetes 41:1473-1490) and
site-directed mutants (Williams, P. F., et al., 1995, J. Biol. Chem. 270:3012-
3016; Mynarcik, D. C., et al., 1996, J. Biol. Chem. 271:2439-2442; Mynarcik, D.
C., et al., 1997, J. Biol. Chem. 272:2077-2081), chimeric receptors (Andersen,
A. S., et al., 1990, Biochemistry 29:7363-7366; Gustafson, T. A., & Rutter, W.
J., 1990, J. Biol. Chem. 265:18663-18667; Schaffer, L., et al., 1993, J. Biol.
Chem. 268:3044-3047; Schumacher, R., 1993, J. Biol. Chem. 268:1087-1094;
Kjeldsen, T., et al., 1991, Proc. Natl Acad. Sci. USA 88:4404-4408) and by
crosslinking studies (Wedekind, F., et al., 1989, Biol. Chem. Hoppe-Seyler
370:251-258; Fabry, M., 1992, J. Biol. Chem. 267:8950-8956; Waugh, S. M., et
al., 1989, Biochemistry 28:3448-3458; Kurose, T., et al., 1994, J. Biol.
Chem. 269:29190-29197). IGF-1R/IR chimeras not only show which
regions of the receptors account for ligand specificity but also provide an
efficient means of identifying some parts of the hormone binding site.
Paradoxically, the regions controlling specificity are not the same for insulin
and IGF-1. Replacing the first 68 residues of IGF-1R with those of IR confers
insulin binding ability on the chimeric IGF-1R (Kjeldsen, T., et al., 1991,
Proc. Natl Acad. Sci. USA 88:4404-4408) and replacing residues 198-300 in
the cys-rich region of IR with the corresponding residues 191-290 of IGF-1R
allows the chimeric receptor to bind IGF-1 (Schaffer, L., et al., 1993, J. Biol.
Chem. 268:3044-3047). Thus a receptor can be constructed which binds both
IGF-1 and insulin with near native affinity. From the structure it is clear that
if the hormone bound in the central space it could contact both these regions.

From analysis of a series of chimeras examined by Gustafson, T. A., &
Rutter, W. J. (J. Biol. Chem. 265:18663-18667, 1990) the specificity
determinant in the cys-rich region can be limited further to residues 223-274.
This region corresponds to modules 4-6 and includes a large and somewhat
mobile loop (residues 255-263, mean B[Cα atoms] = 57 Å²) which extends
into the central space (see Figure 8). In IR this loop is four residues bigger
and is stabilised by an additional disulfide bond (Schaffer, L. & Hansen,
P.H., 1996, Exp. Clin. Endocrinol. Diabetes 104: Suppl. 2, 89). The larger
loop of IR may serve to exclude IGF-1 from the hormone binding site but
allow the smaller insulin molecule to bind. It is interesting to note that the
mosquito IR homologue, which has a loop two residues larger than the
mammalian IRs, also appears to bind insulin but not IGF-1 (Graf, R., et al.,
1997, Insect Molec. Biol. 6:151-163). Analysis of the structure indicates that
the insulin/IGF-1 specificity is controlled by residues in this loop (amino
acids 253-272 in IGF-1R; amino acids 260-283 in IR).
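
To give a feel for the B-factors quoted in this section, an isotropic B-factor can be converted to an rms atomic displacement using the standard relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The sketch below applies it to the mean value of 57 Å² reported for the loop and to the 70 Å² threshold used earlier for less reliable residues; the function name is illustrative, and the interpretation assumes that the B-factors reflect displacement only and not model error.

```python
import math


def rms_displacement(b_factor):
    """RMS displacement (angstroms) along one direction from an isotropic B-factor (square angstroms)."""
    return math.sqrt(b_factor / (8 * math.pi ** 2))


for b in (57.0, 70.0):
    print(f"B = {b:.0f} A^2 -> rms displacement ~ {rms_displacement(b):.2f} A")
```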

As chimeras only address residues which differ between the two
receptors, a more precise analysis of the site can be obtained from single site
mutants. In particular, from an alanine-replacement study, four regions of L1
important for insulin binding were identified (Williams, P. F., et al., 1995, J.
Biol. Chem. 270:3012-3016). The first three are at similar positions on
successive turns of the β-helix and the fourth lies on the conserved bulge on
the large β-sheet (Figure 12). Thus there is a footprint for insulin binding to
the L1 domain which lies on the first half of the large β-sheet facing into the
central space. Residues further along the sheet which are conserved in
IGF-1R could also be important. The conservative substitution of leucine for
methionine at residue 119 of IR (113 of IGF-1R) causes a mild form of
leprechaunism [Hone, J. et al., 1994, J. Med. Genet. 31, 715-716]. This
residue is buried and the mutation could perturb neighbouring residues to
affect insulin binding.

The axis of the L2 domain is perpendicular to that of the L1 domain
and the N-terminal end of its β-helix is presented to the hormone-binding site.
On this face of the L2 domain the only mutation studied so far is the
naturally occurring IR mutant, S323L, which gives rise to Rabson-Mendenhall
syndrome and severe insulin resistance (Roach, P., 1994, Diabetes 43:1096-
1102). As this mutant only affects insulin binding and not cell-surface
expression, residue 323 of IR (residue 313 of IGF-1R) is probably at or near
the binding site. Structurally this residue lies in the middle of a region
(residues 309-318 of IGF-1R) which is conserved in both IR and IGF-1R, and
the surrounding region, 332-345 (of IGF-1R), is also quite well conserved in
these receptors (Figure 9a). Therefore this region is quite likely to form
part of the hormone-binding site but would not have been detected by
chimeras. It is interesting to note that in this region IRR is not as well
conserved as the other two receptors (Shier, P. & Watt, V.M., 1989,
J. Biol. Chem. 264:14605-14608).

The distance from this putative hormone-binding region on L2 to that
found on L1 is about 30 Å (Figure 8). Thus L1 and L2 appear too far apart to
bind IGF-1 or insulin. However, in the crystal structure there is a deep cleft
between part of the cys-rich domain (residue 262) and L2 (residue 305) and
this cleft is occupied by a loop from a neighbouring molecule. Thus it seems
probable that, in the intact receptor or the hormone-receptor complex, the
L2 domain adopts a different position with respect to the
cys-rich domain from that found in the crystal. The movement required to
bring L2 sufficiently close to L1 is small, namely a rotation of approximately
25° about residue 298.
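
The scale of the proposed hinge motion can be illustrated with a rigid-body rotation about an axis through a pivot point. The sketch below uses Rodrigues' rotation formula; the pivot position, the axis direction and the 30 Å lever arm are hypothetical values chosen for illustration, since the text specifies only the approximate angle (25°) and the pivot residue (298).

```python
import math


def rotate_about_axis(point, pivot, axis, angle_deg):
    """Rotate point by angle_deg about an axis passing through pivot (Rodrigues' formula)."""
    theta = math.radians(angle_deg)
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]                      # unit rotation axis
    v = [p - q for p, q in zip(point, pivot)]         # vector from pivot to point
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    rotated = [
        v[i] * cos_t + k_cross_v[i] * sin_t + k[i] * k_dot_v * (1.0 - cos_t)
        for i in range(3)
    ]
    return tuple(r + q for r, q in zip(rotated, pivot))


# Hypothetical geometry: pivot at the origin, axis along z, atom 30 angstroms from the pivot.
pivot = (0.0, 0.0, 0.0)
atom = (30.0, 0.0, 0.0)
moved = rotate_about_axis(atom, pivot, axis=(0.0, 0.0, 1.0), angle_deg=25.0)
shift = math.dist(atom, moved)
print(f"displacement of an atom 30 A from the hinge: {shift:.1f} A")
```

With these assumed values an atom 30 Å from the hinge would move by roughly 13 Å; atoms closer to the pivot, or lying near the rotation axis, would move proportionally less.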

A number of IR mutants have been identified which constitutively
activate the receptor and the majority of these are found in the α chain.
Curiously, all α chain mutants involve changes to or from proline or the
deletion of an amino acid, implying that they cause local structural
rearrangements. The mutation R86N is similar to wild type but R86P reduces
cell-surface expression and insulin binding while constitutively activating
autophosphorylation [Grønskov, K. et al., 1993, Biochem. Biophys. Res.
Commun. 192, 905-911]. The proline mutation probably disturbs residues
preceding 87 which lie in the interface between the L1 and cys-rich domains
but it could also affect insulin binding. In the cys-rich domain residues 233,
281, 244 and 247 of IR are not conserved in IGF-1R (Figure 9b) yet L233P
[Klinkhamer, M.P. et al., 1989, EMBO J. 8, 2503-2507], deletion of N281
[Desbois-Mouthon, C. et al., 1996, J. Clin. Endocrinol. Metab. 81, 719-727] or
the triple mutant P243R, P244R and H247D [Rafaeloff, R. et al., 1989, J. Biol.
Chem. 264, 15900-15904] cause constitutive kinase activation. Due to their
locations each of these three mutants appears likely to compromise the
folding of a β-finger domain and, in turn, the structural integrity of the rod-
like cys-rich domain. The structural ramifications of these mutations could
be significant for the whole receptor ectodomain as disturbing the L1/cys-rich
interface or distorting the rod-like domain could affect the relative position of
L1 and the cys-rich domain in this context.

L1 has been further implicated as deletion of K121 on the opposite
side of L1 from the cys-rich domain was also found to cause
autophosphorylation [Jospe, N. et al., 1994, J. Clin. Endocrinol. Metab. 79,
1294-1302]. By contrast this mutation does not affect insulin binding. Thus a
possible mechanism emerges for insulin binding and signal transduction.
When insulin binds between L1 and L2 it modifies the relative position of L1
and the cys-rich domain in the receptor, perhaps by hinge motion between L2
and the cys-rich domain like that suggested above, and the structural
rearrangement is transmitted across the plasma membrane. In the absence of
insulin the same signal can be initiated by mutations in the cys-rich region or
at the L1/cys-rich interface but at the expense of insulin binding. The signal
can also be initiated more directly by mutations on the opposite side of L1
which affect the interaction of L1 with other parts of the ectodomain,
possibly the other half of the receptor dimer.
Ligand Studies 

Although there is no structural information about an IGF-1/IGF-1R
complex, a number of studies have probed the nature of this interaction.
Results from cross-linking experiments with IGF-1 and insulin and their
cognate receptors are consistent with the hormone binding site proposed
above. For example, B29 of insulin can be cross-linked to the cys-rich region
(residues 205-316) (Yip, C. C., et al., 1988, Biochem. Biophys. Res. Commun.
157:321-329) or the L1 domain (Wedekind, F., et al., 1989, Biol. Chem. Hoppe-
Seyler 370:251-258). However, these two regions are reasonably well
separated and those studies may indicate that B29 is mobile. Other studies
unfortunately do not map the site any more precisely.

Analogues and site-directed mutants of IGF-1 and -2 have been more
fruitful. Relative to insulin, IGF-1 and -2 contain two extra regions, the C
region between B and A and a D peptide at the C-terminus. For IGF-1,
replacement of the C region by a four-Gly linker reduced affinity for IGF-1R
by a factor of 40 but increased affinity for IR 5-fold (Bayne, M.L., et al., 1988,
J. Biol. Chem. 264:11004-11008). Changes in affinity are consistent with the
deletion in IGF-1 complementing differences in the cys-rich regions of
IGF-1R and IR noted above. Mutations of residues either side of the C region
(residue 24 for IGF-1 [Cascieri, M.A., et al., 1988, Biochemistry 27:3229-
3233], residues 27, 43 for IGF-2 [Sakano, K., et al., 1991, J. Biol. Chem.
266:20626-20635]) also have deleterious effects on the affinity of the
hormone for IGF-1R, as has truncation of the nearby D peptide in IGF-2 (Roth,
B. V., et al., 1991, Biochem. Biophys. Res. Commun. 181:907-914). Insulin
has been extensively mutated. Binding studies [summarised in Kristensen,
C. et al., 1997, J. Biol. Chem. 272, 12978-12983] indicate that insulin may
bind its receptor via a hydrophobic patch (residues A2, A3, A19, B8, B11,
B12, B15 and possibly B23 & B24). However, this patch is normally buried
and requires the removal of the B chain's C-terminus from the observed
position. Assuming IGF-1, -2 and insulin bind their receptors in the same
orientation, these data suggest an approximate orientation for the hormone
when bound to the receptor.

One notable feature of IGF-1 and -2 is the large number of charged
residues and their uneven distribution over the surface. Basic residues are
predominantly found in the C region and, in solution, this region is not well
ordered in either IGF-1 or -2 (Sato, A., et al., 1993, Int. J. Peptide Protein Res.
41:433-440; Torres, A. M., et al., 1995, J. Mol. Biol. 248:385-401). In contrast,
the binding site of the receptor has a sizable patch of acidic residues in the
corner where the cys-rich domain departs from L1. Other acidic residues
which are specific to this receptor are found along the inside face of the cys-
rich domain and the loop (residues 255-263) extending from module 6. Thus
it is possible that electrostatics play an important part in IGF-1 binding with
the C region binding to the acidic patch of the cys-rich region near L1 and the
acidic patch on the other side of the hormone directed towards a small patch
of basic residues (residues 307-310) on the N-terminal end of L2.

Although the structure of this fragment gives significant information 
about the nature of the hormone binding site, residues outside this region 
have also been shown to affect binding of ligand. A number of studies have 
implicated residues 704-715 of IR (Mynarcik, D. C. et al., 1996, J. Biol. Chem.
271, 2439-2442; Kurose, T., et al., 1994, J. Biol. Chem. 269:29190-29197).
These residues could contact insulin on one of the sides left open in the
current structure. Using insulin labelled at the B1 residue, Fabry, M., et
al. (1992, J. Biol. Chem. 267:8950-8956) cross-linked insulin to the fragment
390-488, part of which is not near the site as described. The explanation for
this could be either that residue 488 reaches back to the hormone binding
site, or that this region contacts another hormone bound to the other half of
the receptor.
Further structural information is needed to establish how these other regions 
contact the hormone and to elucidate how binding of the hormone is 
communicated to the kinase inside the cell. 

The structure of the L1-cys-rich-L2 domains of IGF-1R presented here
represents the first structural information for the extracellular portion of a
member of the insulin receptor family. The L domains display a novel fold
which is common to the EGF receptor family, and the modular architecture of
the cys-rich domain implies that smaller building blocks should be used to
describe the composition of cysteine-rich domains. This fragment contains
the major specificity determinants of receptors of this class for their ligands.
It has an elongated structure with a space in the middle which could
accommodate the ligand. The three sides of this site correspond to regions
which have been implicated in hormone binding. Although other sites that
interact with the ligand are present in the receptor ectodomain, this
structure gives an initial view of how insulin, IGF-1 and IGF-2 might
interact with their cell surface receptors to control their metabolic and
mitogenic effects.

Such information will provide valuable insight into the structure of 
the corresponding domains of the IR and insulin receptor-related receptor as 
well as members of the related EGFR family (Bajaj, M., et al., 1987, Biochim.
Biophys. Acta 916:220-226; Ward, C. W. et al., 1995, Proteins: Struct. Funct.
Genet. 22:141-153).
EXAMPLE 4 

Prediction of 3D Structure of the Corresponding Domains of IRR and IR
Based on Structure of IGF-1R Fragment

The sequence identities between the different members of the insulin 
receptor family are sufficient to allow accurate sequence alignments to 
facilitate 3D structure predictions by homology modelling. The alignments of 
the ectodomains of human IGF-1R, IR, and IRR are shown in Figure 13.



EXAMPLE 5 

Prediction of 3D Structure of EGFR and its Family Members ERB2, ERB3 
and ERB4. 

The sequence identities between the different members of the EGFR
family and the insulin receptor family are sufficient to allow
accurate sequence alignments to facilitate 3D structure predictions by
homology modelling. The alignments of the ectodomains of human EGFR,
ERB2, ERB3 and ERB4 are shown in Figure 14. The ectodomains of the EGFR
family members are composed of four domains: L1 domain, cys-rich domain,
L2 domain and a second cys-rich domain, all of which can be modelled from
the structure of the IGF-1R fragment (residues 1-462).

The sequence alignment analysis and characterization of the repeat
modules in the cys-rich region of IGF-1R and the homologous regions of the
IR, IRR and the first and second cys-rich regions of EGFR, ErbB2, ErbB3 and
ErbB4 are shown in Figure 15. A representative of each subtype of cys repeat
is found in the IGF-1R fragment 1-462 and is used to model each of these
modules in the other receptors. Note that the nature and order of modules in
the second cys-rich repeat of the EGFR family differ from those seen in the
first cys-rich region.
EXAMPLE 6 

Single-Molecule Imaging of Human Insulin Receptor Ectodomain and its 
Fab Complexes 
Cloning and expression of hIR -11 ectodomain protein

A full-length clone of the human IR exon -11 form (hIR -11) was
prepared by exchanging an Aat II fragment, nucleotides 1195 to 2987, of the
exon +11 clone (plasmid pET; Ellis et al., 1986; gift from Dr W. J. Rutter,
UCSF) of hIR (Ebina et al., 1985, Cell 40, 747-758) with the equivalent Aat II
fragment from a plasmid (pHIR/P12-1, ATCC 57493) encoding part of the
extracellular domain and the entire cytoplasmic domain of hIR -11 (Ullrich
et al., 1985, Nature 313, 756-761). The ectodomain fragment of hIR -11
(2901 bp, coding for the 27 residue signal sequence and residues His1-Asn914)
was produced by Sal I and Ssp I digestion and inserted into the
mammalian expression vector pEE6.HCMV-GS (Celltech Limited, Slough,
Berkshire, UK) into which a stop codon linker had been inserted, as
described previously (Cosgrove et al., 1995, Protein Expression and
Purification 6, 789-798) for the hIR exon +11 ectodomain.

The resulting recombinant plasmid pHIR II (2 µg) was transfected
into glycosylation-deficient Chinese hamster ovary (Lec8) cells (Stanley,
1989, Molec. Cellul. Biol. 9, 377-383) with Lipofectin (Gibco-BRL). After
transfection, the cells were maintained in glutamine-free medium GMEM
(ICN Biomedicals, Australia) as described previously (Bebbington &
Hentschel, 1987, In DNA Cloning (Glover, D., ed.), Vol III, Academic
Press, San Diego; Cosgrove et al., 1995, Protein Expression and Purification 6,
789-798). Expressing cell lines were selected for growth in GMEM with 25
µM methionine sulphoximine (MSX, Sigma). Transfectants were screened for
protein expression using sandwich ELISA with anti-IR monoclonal antibodies
83-7 and 83-14. Metabolic labelling of cells, immunoprecipitations, insulin
binding assays and Scatchard analyses were performed as described
previously for the exon +11 form of hIR ectodomain (Cosgrove et al., 1995,
Protein Expression and Purification 6, 789-798).
hIR -11 ectodomain production and purification

The selected clone (inoculum of 1.28 x 10^8 cells) was grown in a
spinner flask packed with 10 g of Fibra-cel disc carriers (Sterilin, U.K.) in 500
ml of GMEM medium containing 10% fetal calf serum (FCS) and 25 µM MSX.
Selection pressure was maintained for the duration of the culture.

Ectodomain was recovered from harvested media by affinity
chromatography on immobilized insulin and further purified by gel filtration
chromatography on Superdex S200 (Pharmacia; 1 x 40 cm) in Tris-buffered
saline containing 0.02% sodium azide (TBSA) as described previously
(Cosgrove et al., 1995, Protein Expression and Purification 6, 789-798).
Solutions of purified hIR -11 ectodomain were stored at 4°C prior to use.
Production of Fab fragments and their complexes with ectodomain 

Purification of Mabs 83-7, 83-14 and 18-44 from ascites fluid by
affinity chromatography using Protein A-Sepharose, and the production of
Fabs, were based on the methodologies described in Coligan et al., 1993,
Current Protocols in Immunology, Vol 1, pp 2.7.1-2.8.9, Greene Publishing
Associates & Wiley-Interscience, John Wiley and Sons. Fab was produced
from monoclonal antibody by mercuripapain digestion for 1-4 h, followed by
gel filtration on Superdex S200. Products were monitored by reducing and
non-reducing SDS-PAGE. For 83-7 Mab, an IgG type 1 monoclonal antibody,
the bivalent F(ab')2 isolated by this method was converted to monovalent Fab
83-7 by mild reduction with millimolar L-cysteine.HCl in 100 mM Tris pH 8.0
(Coligan et al., 1993, Current Protocols in Immunology, Vol 1, pp 2.7.1-2.8.9,
Greene Publishing Associates & Wiley-Interscience, John Wiley and Sons).

Complexes of Fab with hIR -11 ectodomain were produced by mixing
an approximately 2.5 to 3.5 molar excess of Fab with hIR -11 ectodomain at ambient
temperature in TBSA at pH 8.0. After 1-3 h, the complex was separated from 
unbound Fab by gel filtration over a Superdex S200 column in the same 
buffer. 

Electron microscopy 

Uncomplexed hIR -11 ectodomain and the Fab complexes described
above were diluted in phosphate-buffered saline (PBS) to concentrations of
the order of 0.01-0.03 mg/ml. Prior to dilution, 10% glutaraldehyde (Fluka)
was added to the PBS to achieve a final concentration of 1% glutaraldehyde.
Droplets of ~3 µl of this solution were applied to thin carbon film on 700-mesh
gold grids after glow-discharging in nitrogen for 30 s. After 1 min, the
excess protein solution was drawn off, followed by application and
withdrawal of 4-5 droplets of negative stain [2% uranyl acetate (Agar), 2%
uranyl formate (K and K), 2% potassium phosphotungstate (Probing and
Structure) adjusted to pH 6.0 with KOH, or 2% methylamine tungstate (Agar)
adjusted to pH 6.8 with NH4OH]. In the case of both uranyl acetate and
uranyl formate staining, an intermediate wash with 2 or 3 droplets of PBS
was included prior to application of the stain. The grids were air-dried and
then examined at 60 kV accelerating voltage in a JEOL 100B transmission
electron microscope at a magnification of 100,000x. It was found that there
was a typical thickness of negative stain in which Fabs were most easily
seen; hence areas for photography had to be chosen from particular zones of
the grid. Electron micrographs were recorded on Kodak SO-163 film and
developed in undiluted Kodak D19 developer. The electron-optical
magnification was calibrated under identical imaging conditions by recording
single-molecule images of the antigen-antibody complex of influenza virus
neuraminidase heads and NC10 MFab (Tulloch et al., 1986, J. Mol. Biol. 190,
215-225; Malby et al., 1994, Structure 2, 733-746).
Image processing 

Electron micrographs showing particles in a limited number of
identifiable projections were chosen for digitisation. Micrographs were
digitised on a Perkin-Elmer model 1010 GMS PDS flatbed scanning
microdensitometer with a scanning aperture (square) size of 20 µm and
stepping increment of 20 µm, corresponding to a distance of 0.2 nm on the
specimen. Particles were selected from the digitised micrograph using the
interactive windowing facility of the SPIDER image processing system (Frank
et al., 1996, J. Struct. Biol. 116, 190-199). Particles were scaled to an optical
density range of 0.0-2.0 and aligned by the PSPC reference-free alignment
algorithm (Marco et al., 1996, Ultramicroscopy 66, 5-10). Averages were then
calculated over a subset of correctly aligned particles chosen interactively as
being representative of a single view of the particle. The final average image
presented here is derived from a library of 94 images.
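
The relationship between the scanning step on the film and the sampling
distance on the specimen can be checked with a short calculation. The Python
sketch below is illustrative only: it assumes the stepping increment is 20
micrometres on the film and uses the nominal magnification of 100,000x given
above. The function name is hypothetical and is not part of SPIDER or any
other package.

```python
def specimen_sampling_nm(step_um: float, magnification: float) -> float:
    """Distance on the specimen (nm) covered by one scan step on the film."""
    step_nm = step_um * 1000.0       # micrometres on the film -> nanometres
    return step_nm / magnification   # scale back to the specimen


# Assumed inputs: 20 micrometre step, 100,000x magnification.
sampling = specimen_sampling_nm(step_um=20.0, magnification=100_000)
print(f"{sampling:.2f} nm per scan step")
```

Under these assumptions the step corresponds to 0.2 nm on the specimen,
which agrees with the figure quoted in the text.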
Biochemical characterization of expressed hIR -11 ectodomain 

The recombinant protein examined corresponded to the first 914
residues of the 917 residue ectodomain of the exon -11 form of the human
insulin receptor (Ullrich et al., 1985, Nature 313, 756-761). Expressed protein
was shown, by SDS-PAGE and autoradiography of immunoprecipitated
product from metabolically labelled cells, to exist as a homodimeric complex
of ~270-320 kDa apparent mass, which dissociated under reducing
conditions into monomeric α and β' subunits of respective apparent mass
~120 kDa and ~35 kDa (data not shown).

Purified hIR -11 ectodomain, expressed in Lec8 cells and purified by
affinity chromatography on an insulin affinity column, ran as a symmetrical
peak on a Superdex S200 gel filtration column (Figure 16). The protein eluted
with an apparent mass of ~400 kDa, calculated from a standard curve
generated by the elution positions of standard proteins (not shown). As
expected for protein expressed in Lec8 cells, whose glycosylation defect
produces truncated oligosaccharides (Stanley, 1989, Molec. Cellul. Biol. 9,
377-383), this value is less than the apparent mass (450-500 kDa) reported
for hIR +11 ectodomain expressed in wild-type CHO-K1 cells (Johnson et al.,
1988, Proc. Natl. Acad. Sci. USA 85, 7516-7520; Cosgrove et al., 1995, Protein
Expression and Purification 6, 789-798).

Radioassay of insulin binding to purified ectodomain gave linear
Scatchard plots and Kd values of 1.5-1.8 x 10^-9 M, similar to the values of
2.4-5.0 x 10^-9 M reported for the hIR -11 ectodomain (Andersen et al., 1990,
Biochemistry 29, 7363-7366; Markussen et al., 1991, J. Biol. Chem. 266,
18814-18818; Schaffer, 1994, Eur. J. Biochem. 221, 1127-1132) and the values
of ~1.0-5.0 x 10^-9 M reported for the hIR +11 ectodomain (Schaefer et al.,
1992, J. Biol. Chem. 267, 23393-23402; Whittaker et al., 1994, Molec.
Endocrinol. 8, 1521-1527; Cosgrove et al., 1995, Protein Expression and
Purification 6, 789-798).
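
As an illustration of how a Kd is obtained from a linear Scatchard plot, the
following Python sketch fits a straight line to bound/free versus bound for a
single class of binding sites, where the slope equals -1/Kd and the x-intercept
equals the total site concentration (Bmax). The data are synthetic and
generated inside the example; the function name, the chosen Kd of 1.5 x 10^-9 M
and the Bmax value are assumptions for demonstration only and are not
measurements from this study.

```python
def scatchard_fit(bound, free):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # equals -1/Kd for one site class
    intercept = mean_y - slope * mean_x  # equals Bmax/Kd
    kd = -1.0 / slope
    bmax = -intercept / slope            # x-intercept of the fitted line
    return kd, bmax


# Synthetic single-site binding data (hypothetical values, in mol/l).
KD_TRUE, BMAX_TRUE = 1.5e-9, 1.0e-10
free = [0.2e-9, 0.5e-9, 1e-9, 2e-9, 5e-9, 10e-9]
bound = [BMAX_TRUE * f / (KD_TRUE + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
print(f"Kd = {kd:.2e} M, Bmax = {bmax:.2e} M")
```

A curved rather than linear plot would indicate that the single-site
assumption used in this sketch does not hold.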
Expression of hIGF-1R ectodomain

Cloning, expression and purification of this protein used elements
common to those described for hIR -11 ectodomain (Cosgrove et al., 1995,
Protein Expression and Purification 6, 789-798) and resulted in purified
product that was recognised by receptor-specific Mabs 17-69, 24-31 and 24-60
(Soos et al., 1992, J. Biol. Chem. 267, 12955-63) and was composed of α and
β' subunits of mass similar to those of hIR ectodomain (unpublished data).
Preparation of hIR -11 ectodomain/MFab complexes

A complex of hIR -11 ectodomain and Fab from antibody 83-14 eluted
as a symmetrical peak of 460-500 kDa (Figure 16), as did complexes
generated from a mixture of hIR -11 ectodomain with Fab from antibody 18-44
and a mixture of hIR -11 ectodomain with Fab 83-7 (not shown). A
co-complex of ectodomain with Fabs from antibodies 18-44 and 83-14 eluted at
~620 kDa (Figure 12), as did a co-complex with MFabs 83-14/83-7 and
another with MFabs 83-7/18-44 (not shown). A complex of hIR -11
ectodomain with all three MFab derivatives, 18-44, 83-7 and 83-14, eluted at
an apparent mass of ~710 kDa (Figure 16).
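
The stepwise increase in apparent mass can be compared with a simple additive
estimate. The Python sketch below assumes a typical Fab mass of about 50 kDa
(an assumption, not a value stated in the text) and that each antibody
contributes one Fab per monomer, i.e. two Fabs per ectodomain dimer. Apparent
masses from gel filtration depend on molecular shape as well as mass, so this
is only a rough consistency check.

```python
FAB_KDA = 50.0          # assumed typical Fab mass; not given in the text
ECTODOMAIN_KDA = 400.0  # apparent mass of the undecorated dimer (from the text)


def expected_complex_kda(n_antibodies: int, fabs_per_antibody: int = 2) -> float:
    """Additive estimate: dimer plus two Fabs for each distinct antibody."""
    return ECTODOMAIN_KDA + n_antibodies * fabs_per_antibody * FAB_KDA


# Apparent masses reported in the text, for comparison.
observed = {1: "460-500", 2: "~620", 3: "~710"}
for n in (1, 2, 3):
    print(f"{n} antibody type(s): expected {expected_complex_kda(n):.0f} kDa, "
          f"observed {observed[n]} kDa")
```

Under these assumptions the additive estimates of 500, 600 and 700 kDa are
close to the observed elution positions, consistent with two Fabs of each
antibody binding per dimer.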
Electron microscopy 

Imaging of hIR -11 and hIGF-1R ectodomains

Single-molecule imaging of undecorated dimeric hIR -11 ectodomain
was carried out under a variety of negative staining conditions, which
emphasised different aspects of the structure of the molecular envelope. The
least aggressive or penetrative stain was potassium phosphotungstate (KPT),
which revealed consistent globular particles with very little internal structure
other than a suggestion of a division into two parallel bars. Staining with
methylamine tungstate also revealed the parallel bar images, as shown in
Figure 17a.

Further investigation using progressively more penetrative, but also
potentially more disruptive, stains confirmed the observations above.
Staining with uranyl acetate and uranyl formate showed the separation of the
parallel bars most clearly (Figure 17b), but uranyl acetate showed evidence
of disrupting the structure of the particles, i.e. a decrease in the consistency
of the particle shape and a tendency for particles to look unravelled or
denatured despite having been subjected to chemical cross-linking prior to
staining. In areas of thicker stain, parallel bars predominated (Figure 17b),
whereas in more thinly stained regions, U-shaped particles could be
identified, sometimes outnumbering the parallel-bar structures (Figure 18a).
An averaged image of the parallel bars seen by staining hIR -11 ectodomain
with uranyl formate is shown as an insert in Figure 17b.

In Figures 17c and 18b, images of hIGF-1R ectodomain are shown for
comparison with Figures 17b and 18a, respectively, under similar staining
conditions.

Imaging of hIR -11 ectodomain complexed with 83-7 MFab 

This complex was particularly noteworthy for the consistency of the
form of the particles, especially under the gentler staining conditions
afforded by stains such as KPT and methylamine tungstate. The particles
were interpreted as having been restricted in the views they presented, after
air-drying on the carbon support film, by the almost diametrically opposite
binding of the two Fab arms to the antigen to form a highly elongated
complex structure. Under these conditions three distinct views could be
recognised, as shown in Figure 19. Two views (interpreted as
top-down/bottom-up) show the Fab arms displaced clockwise or anti-clockwise as
extensions of the parallel plates with two-fold symmetry. The third view
shows an image with the two Fab arms in line roughly through the centre of
the receptor on its opposite sides, interpreted as a side projection of binding
half-way up the plates (Figure 19).

Figure 20 shows a field of particles of hIR -11 ectodomain complexed
with 83-7 MFab, stained with uranyl formate. The use of the more aggressive
uranyl stains operating at lower pHs revealed internal structure of the
molecular envelope at the expense of consistency of the particle morphology.
For example, staining with uranyl acetate or uranyl formate showed that
parallel bars can be seen in particles in which the Fab arms are displaced
either clockwise or anticlockwise but not where the intermediate central or
axial position of the two Fab arms is presented in projection. These
observations show 83-7 MFab binding roughly half-way up the side-edge of
each hIR -11 ectodomain plate. The epitope recognised by Mab 83-7 has been
mapped to the cys-rich region, residues 191-297, by analysis of chimeric
receptors (Zhang and Roth, 1991, Proc. Natl. Acad. Sci. USA 88, 9858-9862).



Imaging of hIR -11 ectodomain complexed with either 83-14 MFab or 18-44
MFab

Figure 21a shows the complexes formed with Fabs from the most
insulin-mimetic antibody, Mab 83-14. Projections showing the Fab arms
bound to and extending out from near the base of the U-shaped particles can
be identified. A second field of particles (Figure 21b) shows objects
composed of two parallel bars, as observed for the undecorated ectodomain,
with Fab arms projecting obliquely from diametrically opposite extremities.
Similar but less definitive images were also seen when MFab 18-44 was
bound to hIR -11 ectodomain (not shown). The epitope for Mab 83-14 lies
within residues 469-592 (Prigent et al., 1990) in the connecting domain.
This domain contains one of the disulphide bonds (Cys524-Cys524) between
the two monomers in the IR dimer (Schaffer and Ljungqvist, 1992, Biochem.
Biophys. Res. Commun. 189, 650-653). The epitope for Mab 18-44 is a linear
epitope, residues 765-770 (Prigent et al., 1990, J. Biol. Chem. 265, 9970-9977)
in the β-chain, near the end of the insert domain (O'Bryan et al., 1991, Mol.
Cell. Biol. 11, 5016-5031). The insert domain contains the second disulphide
bond connecting the two monomers in the IR dimer (Sparrow et al., 1997, J.
Biol. Chem. 272, 29460-29467).

Imaging of hIR -11 ectodomain co-complexed with two different MFabs per 
monomer 

The double complex of hIR -11 ectodomain with MFabs 83-7 and 18-44
was stained with 2% KPT at pH 6.0, and revealed the molecular
envelopes shown in Figure 22. The particle appears complex in shape and
can assume a number of different orientations on the carbon support film,
giving rise to a number of different projections in the micrograph. The
predominant view is of an asymmetric X-shape (some examples circled). It
shows the 83-7 MFab arms bound at opposite ends of the parallel bars with
the two 18-44 MFabs appearing as shorter projections extending out from
either side of each ectodomain.

Images of the double complex of hIR -11 ectodomain with 83-7 and
83-14 MFabs gave X-shaped images similar to those seen with the 83-7/18-44
double complex (not shown). In contrast, the double complex of hIR -11
ectodomain with 18-44 and 83-14 MFabs did not present the characteristic
asymmetric X-shapes described above (images not shown). Instead, the
molecular envelope appeared to be elongated in many views, with only an
occasional X-shaped projection. While a detailed interpretation of these
images would be premature, it is clear that MFabs 18-44 and 83-14, two of
the more potent insulin mimetic antibodies (Prigent et al., 1990, J. Biol.
Chem. 265, 9970-9977), can bind simultaneously to the receptor.
Imaging of hIR -11 ectodomain co-complexed with three different MFabs
per monomer

Figure 23 shows a field of particles from a micrograph of hIR -11
ectodomain complexed simultaneously with MFabs 83-7, 83-14 and 18-44. In
the thicker stain regions the molecular envelope is X-shaped, and looks very
similar to that of the double complexes of hIR -11 ectodomain with either
83-7 and 18-44 or 83-7 and 83-14. However, in the more thinly stained regions,
particles of greater complexity are visible and it is possible occasionally to
identify that there are in fact more than four MFabs bound to the ectodomain
dimer.

The single-molecule imaging of hIR -11 ectodomain presented here 
suggests a molecular envelope for this dimeric species significantly different 
from that of any previously published study. However, an unequivocal 
determination of the molecular envelope even from the present study is not 
entirely straightforward. A major complicating factor here has been the 
relative fragility of the expressed ectodomain when exposed to the rigors of 
electron microscope preparation by negative staining. For example, staining 
with potassium phosphotungstate (KPT, pH 6.0-7.0) frequently suggested a
denaturation of the dimeric molecules, but when appropriate conditions were
satisfied, good, seemingly interpretable molecular envelope images were
achieved; staining with methylamine tungstate (pH ~7.0) supported the best
KPT molecular envelope images, but had the suggestion of a swelling of the
molecular structure at neutral pH; and the acid-pH stains of uranyl acetate
(pH ~4.2) and uranyl formate (pH ~3.0), with their ability to penetrate the
ectodomain structure, appeared to illuminate not so much the molecular
envelope as the zones of high projected protein density within the dimer.

An amalgam of impressions from these various staining regimens has
led to the following interpretation of single-molecule images of these
undecorated, or naked, dimers: the predominant dimeric molecular image
encountered here has been that of 'parallel bars' of projected protein density.
This view is so predominant, indeed, that it suggests either that there is a
single preferred orientation of the molecules on the glow-discharged carbon
support film, or that this impression of parallel bars of density represents a
mixture of superficially similar structure projections, with the subtleties of
these different projections being masked by the relatively coarse resolution of
this single-molecule direct imaging. The impression of parallel bars of
projected protein density is particularly predominant in regions of thicker
negative stain. A second view of the molecular envelope, appreciably less
well represented in regions of thicker stain but predominant in regions of
thin staining, is that of 'open' U's, or V's. These two views of hIR -11
ectodomain were supported by the single-molecule imaging of hIGF-1R
ectodomain under comparable conditions of negative staining.

If the assumption is made that these two recognisable projected
views, that of parallel bars and of open U's/V's, are different views of the
same dimeric molecule, an assumption strongly supported by the MFab
complex imaging, a coarse model of the molecular envelope can be
rationalized as in the schematic Figure 24. The model structure is roughly
that of a cube, composed of two almost-parallel plates of high protein
density, separated by a deep cleft of low protein main-chain and side-chain
density able to be penetrated by stain, and connected by intermediate
stain-excluding density near what is assumed here to be their base (that is,
nearest the membrane-anchoring region). The width of the low-density cleft
appears to be of the order of 30-35 Å, sufficient to accommodate the binding
of the insulin molecule of diameter ca. 30 Å, although we have no electron
microscopical evidence to support insulin-binding in this cleft at this stage.

It has been established through imaging of bound 83-7 MFab that
there is a dimeric two-fold axis normal to the membrane surface between
these plates of density. Occasionally, dimer images display a relative
displacement of the bars of density, interpreted here as a limited capacity for
a shearing of the interconnecting zone between the two plates along their
horizontal axis parallel to the membrane; other images show bars skewed
from parallel, implying a limited capacity for the plates to rotate
independently around the two-fold axis, again via this interconnecting zone.
These two observations each suggest a relatively flexible connectivity
between the dimer plates in the membrane-proximal region of intermediate
protein density, which could possibly contribute to the transmembrane
signalling process.

The approximate overall measured dimensions of the ectodomain
dimer depicted in Figure 24 are 110 x 90 x 120 Å, calibrated against the
dimensions of imaged influenza neuraminidase heads, known from the
solved X-ray structure (Varghese et al., 1983, Nature 303, 35-40). It can be
noted that there is a compatibility here between the molecular weights and
molecular dimensions of these two molecular species: the compact
tetrameric influenza neuraminidase heads of Mr ~200 kDa occupy a volume
of almost 100 x 100 x 60 Å; the more open dimeric insulin receptor ectodomains
of similar Mr ~240 kDa imaged here occupy a volume of approximately 110 x
90 x 120 Å, roughly twice that of the neuraminidase heads, accommodating
the slightly higher molecular weight and substantial central low-density cleft.
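
The "roughly twice" comparison can be verified by treating each molecular
envelope as a rectangular box with the quoted edge lengths. The Python sketch
below is a simple arithmetic check under that box approximation; the function
and variable names are illustrative only.

```python
def box_volume(a: float, b: float, c: float) -> float:
    """Volume of a rectangular box with edge lengths a, b and c (in Å)."""
    return a * b * c


# Edge lengths in Å, taken from the dimensions quoted in the text.
neuraminidase = box_volume(100, 100, 60)   # tetrameric neuraminidase heads
ectodomain = box_volume(110, 90, 120)      # hIR -11 ectodomain dimer

ratio = ectodomain / neuraminidase
print(f"neuraminidase: {neuraminidase:,.0f} cubic Å")
print(f"ectodomain:    {ectodomain:,.0f} cubic Å")
print(f"ratio:         {ratio:.2f}")
```

With these dimensions the bounding volumes are 600,000 and 1,188,000 cubic Å,
a ratio of about 1.98.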

The low-resolution roughly cubic compact structure proposed here
differs substantially from the T-shaped model proposed by Christiansen et al.
(1991, Proc. Natl. Acad. Sci. U.S.A. 88, 249-252) and Tranum-Jensen et al.
(1994, J. Membrane Biol. 140, 215-223) for the whole receptor and the
elongated model proposed by Schaefer et al. (1992, J. Biol. Chem. 267,
23393-23402) for soluble ectodomain. Significantly, those previous studies did not
provide any convincing independent electron microscopical evidence that
their imaged objects were in fact insulin receptor.

In the present study, the identity of the imaged molecules as hIR -11 
ectodomain has been confirmed by imaging complexes of the dimer with 
Fabs of the three well-established conformational Mabs against native hIR, 
83-7, 83-14 and 18-44 (Soos et al., 1986, Biochem. J. 235, 199-208; 1989, Proc.
Natl. Acad. Sci. USA 86, 5217-5221), bound singly and in combination. In all
these instances, virtually every particle in the field of view exhibited MFab 
decoration through binding to conformational epitopes, establishing not only 
the identity of the imaged particles but also the conformational integrity of 
the expressed ectodomains. Furthermore, the cleanliness and uniformity of 
these hIR -11 ectodomain preparations, both naked and decorated, visualised 
here by electron microscopy demonstrate their high suitability for X-ray
crystallization trials.

The known flexibility of the Fab arms exacerbates image-to-image 
variability beyond the limited extent already described for the undecorated 
dimeric ectodomains, complicating any precise interpretation of these 
antigen-antibody complexes. Such molecular flexibility also renders largely 
impractical any single-molecule computer image averaging to facilitate image 
interpretation, progressively more so with the higher order antigen-antibody 
complexes studied here. 

The most readily interpretable of these images, showing least
image-to-image variability, are those of 83-7 MFab bound to dimers where,
fortuitously, the antigen-antibody complex is constrained in its degrees of
rotational freedom on the carbon support film. Many projected images show
the two Fab arms in line roughly through the centre of the antigen on its
opposite sides (Figure 19, arrowed examples), interpreted as a side
projection of binding half-way up the plates from their membrane-proximal
base. Other sub-sets of images (Figure 19, circled examples) show the two
Fab arms still parallel but displaced clockwise or anticlockwise with 2-fold
symmetry, each Fab approximating an extension of one of the parallel bars of
antigen density, interpreted here as representing top or bottom projections
along the 2-fold axis. The third projection, along the axis of the Fab arms,
could not be sampled here because of the constraining geometry of this
molecular complex. These observations suggest binding of 83-7 MFab
roughly half-way up the side-edge of the hIR -11 ectodomain plate. This then
allows an initial attempt at spatially mapping the 83-7 MFab epitope, which
has been sequence-mapped to residues 191-297 in the cys-rich region of the
insulin receptor (Zhang and Roth, 1991, Proc. Natl. Acad. Sci. USA 88,
9858-9862). The spatial separation and relative orientations of the two binding
epitopes of Mab 83-7 on the hIR -11 ectodomain dimer as indicated here
appear inconsistent with the proposal that Mab 83-7 could bind
intramolecularly to hIR (O'Brien et al., 1987, Biochem J. 6, 4003-4010).

Decoration of the ectodomain dimer with 83-7 MFab established that
the two plates of high protein-density are arranged with 2-fold symmetry.
Decoration with either 83-14 or 18-44 MFab, on the other hand, allowed
sampling of the third projection of the ectodomain dimer precluded by 83-7
MFab binding. Significantly, this third view established unequivocally the
U-shaped projection of the hIR -11 ectodomain dimer, something which could
only be assumed from the undecorated ectodomain images. Further,
this projection has allowed a rough spatial mapping close to the base of the
U-shaped dimer for the epitopes recognised by 83-14 MFab (residues 469-592,
connecting domain) and 18-44 MFab (residues 765-770, β-chain insert
domain; exon 11 plus numbering, Prigent et al., 1990, J. Biol. Chem. 265,
9970-9977).

Inherent in the model structure presented in Figure 20 is the
implication that, with the two-fold axis aligned normal to the membrane
surface, the mouth of the low-density cleft where insulin binding may occur
would lie most distant from the transmembrane anchor, whilst the zone of
intermediate density connecting the two high-density plates would be in
close proximity to the membrane. It follows, in this model, that the
L1/cys-rich/L2 domains (Bajaj et al., 1987, Biochim. Biophys. Acta 916, 220-226; Ward
et al., 1995, Proteins: Struct. Funct. Genet. 22, 141-153), which comprise
much of the insulin-binding region (see Mynarcik et al., 1997, J. Biol. Chem.
272, 2077-2081), most probably lie in the membrane-distal upper halves of
the two plates, whilst the membrane-proximal lower halves contain the
connecting domains, the fibronectin-type domains, the insert domains and
the interchain disulphide bonds (Schaffer and Ljungqvist, 1992, Biochem.
Biophys. Res. Commun. 189, 650-653; Sparrow et al., 1997, J. Biol. Chem. 272,
29460-29467). Such a disposition of domains is supported by the images
seen with the single MFab decoration, the 83-7 MFab epitope in the cys-rich
region being spatially mapped roughly half-way up the side-edge of the
ectodomain plates, and the 83-14 and 18-44 MFab epitopes (connecting
domain and β-chain insert domain, respectively) being mapped near the base
of the plates. Our preference is for a single α-β' monomer to occupy a single
plate, although the possibility of a single monomer straddling the two plates
of protein density cannot be discounted.

The more complex images involving co-binding of two, and even 
more so of all three, MFabs to each monomer of the ectodomain dimer 
(Figures 22 and 23) are not easily interpretable with respect to relative 
domain arrangements within the monomer at present, not least of all because 
of the difficulty of finding conditions of negative staining that will 
simultaneously maintain the integrity of the Fab binding while highlighting 
recognisable and reproducible details of the internal structure of the dimeric 
IR ectodomain. 

The data presented here demonstrate the ability of single-molecule 
imaging to give an initial insight into the topology of multidomain structures 
such as the ectodomain of hIR, and the value of combining this technique 
with that of either single or multiple monoclonal Fab attachment per 
monomer as a potential means of epitope (and domain) mapping of the
structure. By imaging Fab complexes of other members of the family (such as
hIGF-1R ectodomain) and combining available sequence-mapped epitope
information with that presented here, a more comprehensive understanding 
of domain arrangements within the IR family ectodomains should be 
forthcoming. 

It will be appreciated by persons skilled in the art that numerous
variations and/or modifications may be made to the invention as shown in
the specific embodiments without departing from the spirit or scope of the
invention as broadly described. The present embodiments are, therefore, to
be considered in all respects as illustrative and not restrictive.
Figure 6

(a) SDS PAGE

(b) IEF pH 3-7

Figure 10



Figure 13: Sequence Alignment of hIGF-1R, hIR and hIRR ectodomains.

Derived by use of the PileUp program in the software package of the Genetics Computer Group, 575
Science Drive, Madison, Wisconsin, USA.

Symbol comparison table: GenRunData:PileUpPep.Cmp CompCheck: 1254

GapWeight: 3.0 
GapLengthWeight : 0.1 



Name: Higflr Len: 972 Check: 1781 Weight: 1.00
Name: Hir Len: 972 Check: 2986 Weight: 1.00
Name: Hirr Len: 972 Check: 9819 Weight: 1.00
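
The GapWeight and GapLengthWeight settings listed above can be read as the two terms of an affine gap penalty. The following Python sketch is illustrative only and is not part of the PileUp program: it assumes that a gap of n residues is charged GapWeight + GapLengthWeight x n, and the function name gap_penalty is hypothetical.

```python
def gap_penalty(gap_length: int,
                gap_weight: float = 3.0,
                gap_length_weight: float = 0.1) -> float:
    """Affine penalty for a single gap of `gap_length` residues.

    Assumption (not taken from the specification): the penalty is
    gap_weight + gap_length_weight * gap_length, with the defaults set
    to the GapWeight and GapLengthWeight values quoted in the legend.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_length == 0:
        # No gap was opened, so nothing is charged.
        return 0.0
    return gap_weight + gap_length_weight * gap_length


if __name__ == "__main__":
    # Opening a gap dominates the cost; lengthening it is comparatively cheap.
    for n in (1, 4, 10):
        print(n, round(gap_penalty(n), 2))
```

Under this assumption, one long gap costs much less than several short ones, which favours alignments that group insertions and deletions together.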



Higflr EICGP GIDIRNDYQQ LKRLEHSiyi EGYLHILLIS K. .AEDYRSY 43 

Hir HLYPGEVC.P GMDIRNMLIR LHELEH^VI EGHLQILLMF KTRPEDFRDL 49 

Hirr MNVC. P SLDIRSEVAE LRQLEHeSW EGHLQILLMF TATGEDFRGL 45 

Higflr RFPKLTVITE YLLLFRVAGL ESLGDLFP NL T VIRGWKLFY NYALVIFEMT 93 

Hir SFPKLIMITD YLLLFRVYGL ESLKDLFPHLJEVIRGSRLFF NYALVIFEMV 99 

Hirr SFPRLTQVTD YLLLFRVYGL ESLRDLFPNL AVIRGTRLFL GYALVIFEMP 95 

* 

Higflr NLKDIGLYNL RHITRGAIRI EKNADLCYLS TVDWSLILDA VSNNYIVGNK 143 

Hir KLKELGLYNL M^TTRGSVRI EKNNELCYLA TIDWSRILDS VEDNYIVLNK 149 

Hirr HLRDVALPAL GAVLRGAVRV EKNQELCHLS TIDWGLLQPA PGANHIVGNK 145 

« * * * * * * 

Higflr PPK.ECGDLC PGTMEEKPM. CEKTTINNEY NYRCWTTNRC QKMCPSTCGK 191 

Hir DDNEECGDIC PGTAKGKTN . CPATVINGQF VERCaTTHSHC QKVCPTICKS 19 8 

Hirr LG.EECADVC PGVLGAAGEP CAKTTFSGHT DYRCWTSSHC QRVCPCPHG . 193 

Higflr RACTEMNECC HPECLGSCSA PDNDIAC/AC RHYYYAGVCV PACPPNTYRF 2 41 

Hir HGCTAEGLCC HSECLG NCS O PDDPTKCVAC RNFYLDGRCV ETCPPPYYHF 24 8 

Hirr M.=.CrARGECC HTECLGGCSQ PEDPRACVAC RHLYFQGACL WACPPGTYQY 243 



Higflr EGWRCVDRD? CANILSAES . . . . SDSEGFV IHDGECMQEC PSGFI RNGS O 287 

Hir ODWRC\ /NFS ? CQDLHHKCKN SRRQGCHQYV IHNNKCIPEC PSGYT MNSS N 298 

Hirr E3WRCVTAER CASLHSVPG RASTFG IHQGSCLAQC PSGFT RNSS . 2 87 

* * * * * 

Higflr SMYdPCEG? CPKVCEEEKK TKTIDSVTSA QMLQGCTIFK GNLLINIRRG 33 7 

Hir .LLCTPCLGP CPKVCHLLEG EKTIDSVTSA QELRGCrVIM_S£LIINIRGG 347 

Hirr SIFCHKCEGL CPKECKV. .G TKTIDSIQAA QDLVGCTHVE GSLILNLRQG 335 

Higflr NNIASELENF MGLIEWTGY VKIRHSHALV SLSFLXNLRL ILGEEQLEGM 3 87 

Hir NMLAAELEAN LGLIEEISGY LKIRRSYALV SLSFFRKLRL IRGETLEIGN 3 97 

Hirr YMLEPQLQHS LGLVETITGF LKIKH3FALV SLGFFKNLKL IRGDAMVDGM 3 85 



Higflr XirYVLDNQN LQQLWDWDHR NLTIK-^GKMY F.^FNPKLCVS EIYRMEEVTG 437 

Hir v^rYALDNQM LRQLWDWSKH NLXlTCGKLr FHYNPKLCLS EIHKMEEVSG 447 

Hirr iZ-TvLDMQM LQQLGSWA\ GLTIPVGKIY FAFNPRLCLE HIYRLEEVTG 43 5 

* 1 End of 1-462 fragment 

Higflr TKGRQSKGDI .MTRMNGER.:^S CE3DV LHFTS TTTSKNRIII TWHRYRPPDY 487 

Hir TKGRQERNDI ALKTNGDQ.^lS CENEL LKrSY IRTSFDKILL RWEPYWPPDF 497 

Hirr TRGRQNKAEI NPRTNGDRAA CQTRT LRFVS NVT EADRILL RWERYEPLEA 485 



Higflr RDLISFTVYY KEAPFKMVIE YDGQDACGSN SWNMVDVDLP PNKDV 532
Hir RDLLGFMLFY KEAPYQUYXE FDGQDACGSN SWTWDIDPP LRSNDPKSQN 547
Hirr RDLLSFIVYY KESPFQHaiE HVGPDACGTQ SWNLLDVELP L SRTQ 530

Higflr EPGILLHGLK PWTQYAVYVK AVTLTMVEND HIRGAKSEIL YTR TNASV PS 582
Hir HPGWLMRGLK PWTQYAIFVK TL.VTFSDER RTYGAKSDII YVQTDATNPS 596
Hirr EPGVTLASLK PWTQYAVFVR AITLTTEEDS PHQGAQSPIV YLRTLPAAPT 580

Higflr IPLDVLSASli_SSSQLIVKWN PPSLPNG NLS YYIVRWQRQP QDGYLYRHNY 632
Hir VPLDPISVSILaSSQIILKWK PPSDPN GNIT HYLVFWERQA EDSELFELDY 646
Hirr VPQDVISTSH__aSSHLLVRWK PPTORNG NLT YYLVLWQRIA EDGDLYLNDY 630
* ******

Higflr CSKD.KIPIR KYADGTIDIE EVTENPKTEV OGGEKGPCOV C. . . PKTEAE 678
Hir CLKGLKLPSR TWS . PPFESE DSQKHliQ£E. YEDSAGECCS C. . . PKTDSQ 691
Hirr CHRGLRLPTS N.NDPRFDGE DGDPEAEME SDCCP CQHPPPGQVL 673

α >< β
Higflr KQAEKEEAEY RKVFENFLHN SIFVPRPERK RRDVMQVAmL-IMSSRSRUII 728
Hir ILKELEESSF RKTFEDYLHN WFVPRPSRK RRSLGDVGlDLJrVAVP. . .TV 738
Hirr PPLEAQEASF QKKFENFLHN AITIPISPWK VTSIMKSPQR D.SGRHRRAA 722
*

Higflr AA..DTYJiIT DPEELETEYP FFESRVDNKE RTVISNLRPF TLYRIDIHSC 776
Hir AAFPmSSTS VPTSPEEHRP F . . EKWNKE SLVISGLRHF TGYRIELQAC 786
Hirr GPLRLGG NSS DFEIQEDKVP RE RAVLSGLRHF TEYRIDIHAC 764

Higflr NHEAEKLGCS ASNFVFARTM PAEGADDIPG PVTWEPRPEN SIFLKWPEPE 826
Hir NQDTPEERC3 VAAYVSARTM PEAKADDIVG PVTHEIFENN WHLMWQEPK 836
Hirr NHAAHTVGCS AATFVFARTM PHREADGIPG KVAWEASSKN SVLLRWLEPP 814

Higflr NPMGLILMYS IKYGS.QVED QRECV*SRQEY RKYGGAKLNR LNPGNIIARI 875
Hir EPMGLI^/LYE VSYRRYGDEE LHLCVSRKHF ALERGCRLRG LSPGM:£SVRI 886
Hirr DFMGLILKYE IKYRRLGEEA TVLC/3RLRY AKFGGVHLAL LPPG NYS ARV 864

Higflr OATSLSG NGS WTDPVFFYVQ AKTGYENFIH L 906
Hir RATSLAG NGS '/TTEPTYFYVT DYLDVPSMIA K 917
Hirr RATSLAGiISS WTDSVAFYIL GPEEEDAGGL H 895



Figure 14: Sequence Alignment of EGFR, ErbB2, ErbB3 and ErbB4 Ectodomains.
(For alignment on the IGF-1R fragment see Fig. 9)

Derived by use of the PileUp program in the software package of the Genetics Computer Group, 575
Science Drive, Madison, Wisconsin, USA.

Symbol comparison table: GenRunData:PileUpPep.Cmp CompCheck: 1254

GapWeight: 3.000
GapLengthWeight: 0.100

Name: Erb3 Len: 649 Check: 4625 Weight: 1.00
Name: Erb4 Len: 649 Check: 790 Weight: 1.00
Name: Egfr Len: 649 Check: 2381 Weight: 1.00
Name: Erb2 Len: 649 Check: 8174 Weight: 1.00



1 50
Erb3 SEVGNSQAVC PGTLNGLSVT GDAENQYQTL YKLYERCEW MGNLEIVLTG
Erb4 . . . SDSQSVC AGTENKLSSL SDLEQQYRAL RKYYENCEW MGNLEITSIE
Egfr . - .LEEKKVC QGTSNKLTQL GTFEDHFLSL QRMFNNCEW LGNLEITYVQ
Erb2 STQVC TGTDMKLRLP ASPETHLDML RHLYQGCQW QGNLELTYLP

51 100
Erb3 HNADLSFLQW IREVTGYVLV AMNEFSTLPL PNLRWRGTQ VYDGKFAIFV
Erb4 HNRDLSFLRS VREVTGYVLV ALNQFRYLPL ENLRIIRGTK LYEDRYALAI
Egfr RNYDLSFLKT IQEVAGYVLI ALNTVERIPL ENLQIIRGNM YYENSYALAV
Erb2 TNA3LSFLQD IQEVQGYVLI AHNQVRQVPL QRLRIVRGTQ LFEDNYALAV

101 150
Erb3 MLNYN TNSSHA LRQLRLTQLT EILSGGVYIE KNDKLCHMDT
Erb4 FLNYR KDGNFG LQELGLKNLT EILNGGVYVD QNKFLCYADT
Egfr LSNYD ANKT.G LKELPMRNLQ EILHGAVRFS NNPALCNVES
Erb2 LDNGDPLNNT TPVTGASPGG LRELQLRSLT EILKGGVLIQ RNPQLCYQDT

151 200
Erb3 ID^vUDIVRDR . . .DAEIWK DNGRSCPPCH EVC.KGRCWG PGSEDCQTLT
Erb4 IHWQDIVPJM? WPSNLTLVST NGSSGCGRCH KSC.TGRCWG PTENHCQTLT
Egfr IQVj-RDIVSSD "LSMMSMDFQ NHLGSCQKCD PSCPNGSCWG AGEENCQKLT
Erb2 ILWKDIFHKN NQLALTLIDT NRSRACHPCS PMCKGSRCWG ESSEDCQSLT

201 250
Erb3 KTICAPQCNG KCFGPNPNQC CHDECAGGCS GPQDTDCFAC RHFNDSGACV
Erb4 RTVCAEQCDG RCYGPYVSDC CHRECAGGCS GPKDTDCFAC MNFNDSGACV
Egfr KIICAQQCSG RCRGKSPSDC CHNQCAAGCT GPRESDCLVC RKFRDEATCK
Erb2 RTVCAGGC.A RCKGPLPTDC CHEQCAAGCT GPKHSDCLAC LHFNHSGICE

251 300
Erb3 PRCPQPLVYN KLTFQLEPNP HTKYQYGGVC VASCPHNFW .DQTSCVRAC
Erb4 TQCPQTFVYN PTTFQLEHNF NAKYTYGAFC VKKCPHNFW . DSSSCVRAC
Egfr DTCPPLMLYN PTTYQMDVNP EGKYSrGATC VKKC PRNYW TDHGSCVRAC
Erb2 LKCPALVTYN TDTFESMPNP EGRYTFGASC VTACPYNYLS TDVGSCTLVC

301 350
Erb3 PPDKMSV.dk NGLKMCEPCG GLCPKACEGT GSGSRF..QT VDSSNIDGFV
Erb4 PSSKMEV.EE NGIKMCKPCT DICPK-i^CDGI GTGSLMSAQT VDSSNIDKFI
Egfr GADSYEM.ee DGVRKCKKCE GPCRKVCNGI GIGEFKDSLS INATNIKHFK
Erb2 PLHNQEVTAE DGTQRCEKCS KPCARVCYGL GMEHLRSVRA VTSANIQEFA

351 400
Erb3 NCTKILGNLD rllTGLNGDP DVHKIPALDPS KLlWcRTVRE ITGYLNIQSW
Erb4 NCTKIMGNLI rLVTGIHGDP YNAIEAIDPE KLNVFRTVRE ITGFLNIQSW
Egfr NCTSrSGDLH ILPVAFRGDS FTHTPPLDPQ ELDILKTVKE ITGFLLIQAW
Erb2 GCKKIFG3LA FLPESFDGDP ASMTAPLQPE QLQVFETLEE ITGYLYISAW

401 450
Erb3 PPHMHNFSVF SN'LTTIGGRS LYNRGFSLLI MKNLNVTSLG FRSLKEISAG
Erb4 PPNMTDFSVF SNLVTIGGRV LYS.GLSLLI LKQQGITSLQ FQSLKEISAG
Egfr PENRTDLHAF ENLEIIRGRT KQHGQFSLAV VS . LNITSLG LRSLKEISDG
Erb2 PDSLPDLSVF QNLQVIRGRI LHNGAYSL.T LQGLGISWLG LRSLRELGSG

451 End L2 domain> 500

Erb3 RIYISANRQL CYHHSLNWTK VLRGPTEERt, DIKHNRPRRD CVA EGKVCDP 

Erb4 NIYITDNSNL CYYHTINWTT LF.STINQRI VIRDNRKAEN CTA EGMVCNH 

Egfr DVIISGNKNL CYANTINWKK LF.GTSGQKT KIISNRGENS CKA TGQVCHA 

Erb2 LALIHHNTHL CFVHTVPWDQ LFRNP.HQAL LHTANRPEDE CVG EGLACHQ 

^ 501 550 

Erb3 LCSSGGCWGP GPGQCLSCRN YSRGGVCVTH CNFLNGEPRE FAHEAECFSC 

Erb4 LCSSDGCWGP GPDQCLSCRR FSRGRICIES CNLYDGEFRE FENGS ICVEC 

Egfr LCSPEGCWGP EPRDCVSCRN VSRGRECVDK CKLLEGEPRE FVENSECXQC 

Erb2 LCARGHCWGP GPTQCVNCSQ FLRGQECVEE CRVLQGLPRE YVNARHCLPC 

551 600 

Erb3 HPECQPME.G TATCNGSGSD TCAQCAHFRD GPHCVSSCPH GVLGA.KGP. 

Erb4 DPQCEKMEDG LLTCHGPGPD NCTKCSHFKD GPNCVEKCPD GLQGA.NSF. 

Egfr HPECLPQAMN I.TCTGRGPD NCIQCAHYID GPHCVKTCPA GVMGENNTL. 

Erb2 HPECQPQN.G SVTCFGPEAD QCVACAHYKD PPFCVARCPS GVKPDLSYMP 

601 649 

Erb3 lYKYPDVQNE CRPCHENCTQ GCKGPELQDC L GQT 

Erb4 IFKYADPDRE CHPCHPNCTQ GCNGPTSHDC lYYPWTGHST LPQHARTPL 

Egfr VWKYADAGHV CHLCHPNCTY GCTGPGLEGC PTNGPKIPS 

Erb2 IWKFPDEEGA CQPCPINCTH SCVDLDDKGC PAEQRASPLT S, , . . 






Figure 15. Classification of Cys-rich modules
C2-4 denote modules with the 1-3/2-4 double disulphide bond connections. 
C1-2 for the single disulphide bonded modules and
CU2t for stabilised beta turn. 
Figure 16



